OLCF Best Practices (and More) Bill Renaud OLCF User Assistance Group
1 OLCF Best Practices (and More) Bill Renaud OLCF User Assistance Group
2 Overview This presentation covers some helpful information for users of OLCF: staying informed; some aspects of system usage that may differ from your past experience; some common error messages; and common questions/other tips on using the systems. This is by no means an all-inclusive presentation. Feel free to ask questions.
3 Staying Informed OLCF provides multiple layers of user notifications about system status and downtimes: the OLCF Weekly Update, status indicators on olcf.ornl.gov, opt-in email lists (subscription management currently unavailable), Twitter, and StatusCast (coming soon). A summary of these items can also be found on the OLCF website.
4 Staying Informed: Weekly Update Sent weekly (Thursday/Friday). Contains several items: announcements about upcoming training, announcements about upcoming system changes, and planned outages for the next week. All OLCF users should receive this. Let us know if you're not receiving it!
5 Staying Informed: System Status Automated scripts parse logs from our monitoring software and make an educated guess as to system state. This status is then sent to multiple destinations: websites, Twitter, and email lists. While this is fairly accurate, it is a fully automated process; thus, there is a possibility of both false positives and false negatives. We do take some measures to mitigate this.
6 System Status: (screenshot)
7 System Status: users.nccs.gov (screenshot)
8 System Status: (screenshot)
9 System Status: Opt-In Lists We also send status up/down notices via email. These are available on an opt-in basis; see the OLCF website. Subscribe only to lists of systems of interest to you. Other notices are also sent to these lists, so you may want to sign up (once subscription management is re-enabled).
10 System Status: Opt-In Lists (screenshot)
11 System Status: StatusCast Users have told us via our survey that they'd like more status information online. Work is ongoing on a new tool to provide that information: current status, planned outages, previous outages, and announcements. Stay tuned to the OLCF Weekly Update for more information.
12 System Status: Smartphone Apps System status apps have been developed (Android: OLCF StatusApp; iPhone: OLCF System Status). Choose systems to monitor and receive automated notifications of system changes. Usage instructions are on olcf.ornl.gov. Currently unavailable due to app maintenance.
13 Using the Systems at OLCF: software, compiling, common error messages, common questions, other tips/best practices.
14 Finding Software Some software is part of the default environment: basic commands, text editing utilities. Software is typically managed via the modules utility (the software itself is installed in /sw). To list available software, use module avail; to use a package, use module load. A minimal session is sketched below. More information is available on the OLCF website. Important items, including compilers, MPI libraries, and GPU libraries, are also available via modules.
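A minimal sketch of a typical modules session; the package name (fftw) is illustrative, and the exact versions listed will differ:

$ module avail            # list all available software
$ module avail fftw       # narrow the listing to one package
$ module load fftw        # add the package to your environment
$ module list             # confirm what is currently loaded
$ module unload fftw      # remove it when no longer needed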
15 Software Installation/Updating We are moving to a model of updates to software packages at certain intervals (or in the case of major revisions). This means not all minor versions will be installed. We'll move towards adding build instructions on the website so that you can build minor revisions/slightly different versions. Look for information on the OLCF website and via Weekly Update emails.
16 Software Installation You are free to install software in your directories (including your project directory), subject to terms of license agreements, export control laws, etc. If you think a piece of software would be of general interest, you might ask us to install it for general use. The preferred method is the request form on the OLCF website, but email to [email protected] works, too. Requests will be reviewed by our software council.
17 Compiling for the XK7 The compilers on the XT/XE/XK line of systems may differ (significantly) from your previous experience. The environment is a combination of ?-asyncpe and PrgEnv-? modules: ?-asyncpe provides the compiler wrapper scripts, and PrgEnv-? loads modules for compilers, math libraries, MPI, etc. Regardless of the actual compiler being used (PGI, Intel, GNU), invoke with cc, CC, or ftn. MPI, math, and scientific libraries are included automatically; no -lmpi, -lscalapack, etc. This can be challenging when dealing with some build processes. A minimal sketch follows.
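A minimal sketch of building with the wrappers, assuming the PGI programming environment; file names are illustrative:

$ module load PrgEnv-pgi                 # compiler, MPI, and math libraries in one step
$ ftn -o sim.x sim.f90                   # Fortran via the wrapper; no -lmpi needed
$ cc -o tool.x tool.c                    # C via the wrapper; libraries linked automatically
$ module swap PrgEnv-pgi PrgEnv-gnu      # change compilers; invocations stay cc/CC/ftn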
18 Compiling for the XK7 You are actually cross-compiling: processors (and instruction sets) differ between login and compute nodes. It is very important to realize this; utilities like configure often depend on being run on the target architecture, so they can be challenging to use on the XK7. Compiling for login/batch nodes is occasionally necessary. There are three ways to do this (sketched below): module swap xtpe-interlagos xtpe-target-native; add target=native to cc/CC/ftn; or call the compilers directly (e.g. pgcc, pgf90, ifort, gcc).
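A sketch of the three approaches; the file names are illustrative, and the flag spelling follows the slide:

$ module swap xtpe-interlagos xtpe-target-native   # 1: retarget the wrappers to the login nodes
$ cc -o hosttool hosttool.c
$ module swap xtpe-target-native xtpe-interlagos   # restore the compute-node target afterwards
$ cc -target=native -o hosttool hosttool.c         # 2: per-invocation flag
$ pgcc -o hosttool hosttool.c                      # 3: call the underlying compiler directly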
19 Common Runtime Errors "Illegal Instruction": a code was compiled for the compute nodes but executed on login nodes. "request exceeds max nodes alloc": the number of cores required to satisfy the aprun command exceeds the number requested. Node allocation should be more intuitive now that we're using #PBS -l nodes=n to request resources (as compared to #PBS -l size=n). This also happens when your request is correct, but at launch time a node is discovered to be down. Quick fix: request extra nodes and include logic to restart in your batch script (a sketch follows).
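A minimal sketch of the quick fix, assuming a hypothetical 128-node job under project stf007; the spare-node count and retry logic are illustrative, not an official recipe:

#!/bin/bash
#PBS -l nodes=129                # one spare node beyond the 128 actually needed
#PBS -l walltime=2:00:00
cd $MEMBERWORK/stf007
aprun -n 2048 ./a.out            # 128 nodes x 16 cores
if [ $? -ne 0 ]; then            # if the launch failed (e.g., a node was down), retry once
    aprun -n 2048 ./a.out
fi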
20 Common Runtime Errors "relocation truncated to fit: R_X86_64_PC32": the static memory used by your code exceeds what's allowed by the memory model you're using. Only the small memory model is available, so the error appears once static data reaches 2GB. Solution: use dynamic memory allocation to the greatest extent possible.
21 Common Runtime Errors "aprun: [NID 94] Exec a.out failed: chdir /autofs/na1_home/user1 No such file or directory": when you run aprun, you must do so from a directory visible to compute nodes (we'll discuss this in a few slides). As a general rule, Lustre directories are visible and others are not. When the job starts, it will attempt to cd into the working directory from which you submitted aprun; in the example above, aprun was executed from the user's home directory. Any files used by the processes running on compute nodes (input, output, .so files, etc.) must be in directories visible to the compute nodes. This is not the case for executables; they can be in your home directory.
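A minimal sketch; the project ID and file names are hypothetical:

$ cd $MEMBERWORK/stf007          # Lustre: visible to compute nodes
$ cp ~/input.dat .               # stage input files out of the NFS home area
$ aprun -n 16 ~/bin/a.out        # the executable itself may remain in $HOME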
22 Common Questions Why isn't my job running? Can be a number of reasons: insufficient system resources (nodes) available; pending downtime (reservation); queue policy; dependency (on another job) not met. Use checkjob <jobid> to diagnose; add -v for verbose output. It produces lots of data, but clues to jobs not running are typically near the end of the output. Use showres to see upcoming reservations (outages); it also shows all running jobs, so you have to look carefully.
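Typical diagnostic commands (the job ID is illustrative):

$ checkjob 123456      # summary of why the job is (or is not) running
$ checkjob -v 123456   # verbose output; the useful clues are usually near the end
$ showres              # upcoming reservations, mixed in with running jobs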
23 Common Questions Why isn't my job running?
Insufficient resources available:
NOTE: job cannot run (insufficient available procs: 3264 available)
Unresolved dependency:
NOTE: job cannot run (dependency jobcomplete not met)
Queue policy issue:
NOTE: job violates constraints for partition titan (job violates active HARD MAXJOB limit of 2 for qos smallmaxjobs user partition ALL (Req: 1 InUse: 2))
BLOCK MSG: job violates active HARD MAXJOB limit of 2 for qos smallmaxjobs user partition ALL (Req: 1 InUse: 2) (recorded at last scheduling iteration)
24 Common Questions Why isn't my job running? There's a pending maintenance period/reservation. This is example output from showres. Currently running jobs will also be listed; we're only interested in ReservationIDs that aren't integers (or integers followed by .1). There's a full-system (18,688 node) reservation starting in 15 hours: am I asking for more than 15 hours of walltime?

ReservationID Type S Start End Duration N/P StartTime
DTNOutage.127 User - 15:05:13 1:03:05:13 12:00:00 11/176 Tue Jan 28 08:00:00
PM.128 User - 15:05:13 1:03:05:13 12:00:00 / Tue Jan 28 08:00:00
PM_login.129 User - 15:05:13 1:03:05:13 12:00:00 16/2048 Tue Jan 28 08:00:00
25 Common Questions When will my job start? Can be difficult to tell, since it can change based on future jobs submitted to the queue. showstart <jobid> is not always reliable. showq output is sorted by priority. mdiag -p can give detailed priority information: it shows boosts (or lack thereof) based on job size, and shows overallocation penalties.
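The corresponding commands (job ID illustrative):

$ showstart 123456     # scheduler's estimate; not always reliable
$ showq                # queue listing, sorted by priority
$ mdiag -p             # per-job priority breakdown: size boosts, overallocation penalties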
26 Common Questions What project am I on, and what's its allocation? Use showproj to list your projects. Use showusage to display utilization. Both commands have a help option; run them with -h for usage info.

$ showproj
User1 is a member of the following project(s) on titan:
stf007

$ showusage
titan usage for the project's current allocation period:
Project Totals user1
Project Allocation Usage Remaining Usage
stf007
stf007de1 0 0
27 Common Questions What happens when my project overruns its allocation? Most importantly, we do not disable the project; jobs simply run at lower priority. Jobs submitted to projects slightly over allocation (100-125%) receive a 30-day priority reduction. Jobs submitted to projects well over allocation (>125%) receive a 365-day priority reduction. This allows a degree of fairshare while still allowing people to run when the system is quiescent.
28 Common Questions My project has lost X hours due to system issues; can I get that time reimbursed? Since we don't disable projects for going over allocation, we also don't deal with refunds per se. If many jobs are affected, the priority reduction can be delayed. This is basically a refund, but is much easier to manage.
29 Common Questions I changed permissions on /tmp/work/$USER, but they changed back; why? Permissions in the Lustre filesystem are controlled by settings in our accounts database. These settings only affect the top-level permission, and permissions are automatically (re-)set regularly. Most users can request they be changed: send email to [email protected]. Note that you'll need to email us to change them back, too. Of course, you can always just chmod everything under the top-level directory yourself (a sketch follows). We can't change permissions on directories associated with sensitive data.
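A sketch of doing it yourself below the top level; the path and mode are illustrative:

$ chmod -R g+rX /tmp/work/$USER/shared_data   # group members can read files and traverse directories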
30 Common Questions Can I get a priority boost? A higher walltime limit? A purge exemption? A larger home directory quota? Perhaps. See the Forms to Request Changes to Computers, Jobs or Accounts page on the OLCF website. Once submitted, forms are sent to the Resource Utilization Council for approval. Make requests well in advance; it can be difficult to make last-minute changes. If requesting job priority, be sure the job is submitted (the queue may move more quickly than expected, eliminating the need for the request).
31 Common Questions Is my data backed up? NFS directories: yes, a limited backup exists in ~/.yesterday/

$ cd ~/SomeDir/SomeOtherDir
$ ls
file1 file2
$ rm file1
$ ls
file2
$ cp ~/.yesterday/SomeDir/SomeOtherDir/file1 .
$ ls
file1 file2

Lustre directories: no. HPSS: no; you might use it as a backup of your directories, but HPSS itself is not backed up. It's a good idea to have another level of backup at some other site.
32 Common Questions Where do I archive my data? Is there a mass storage system available? HPSS, accessed via hsi and htar; see the OLCF presentation on HPSS for details.
33 Common Questions How do I transfer my data? If the main system is down, can I access my data? The system user guides (on the OLCF website) discuss data transfer; a Data User Guide is coming soon. The Data Transfer Nodes (dtn.ccs.ornl.gov) are the preferred place for data movement (both internal and external). They are almost always available, even during outages of major resources, and all users should have access.
34 Common Questions Where do I store data? Home directory (NFS): user-specific and project-specific areas. Scratch/work directories (Lustre): there are no user scratch areas; everything's part of a project. Not shared with anyone else: $MEMBERWORK/[projid]. Shared with members of your project: $PROJWORK/[projid]. Shared with all users: $WORLDWORK/[projid]. Archival storage (HPSS): user-specific and project-specific areas. The storage hierarchy can be a bit complex; see the OLCF website for a detailed description, and the sketch below.
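A quick way to locate the Lustre areas above, for a hypothetical project stf007:

$ echo $MEMBERWORK/stf007    # private scratch within the project
$ echo $PROJWORK/stf007      # shared with members of the project
$ echo $WORLDWORK/stf007     # shared with all users
$ cd $PROJWORK/stf007        # e.g., work in the project-shared area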
35 Common Questions How do I check my disk usage? For home directories, use the quota command:

$ quota
quota: error while getting quota from master:/cm/shared for user1 (id 98765): Connection refused
quota: error while getting quota from master:/home for user1 (id 98765): Connection refused
Disk quotas for user user1 (uid 98765):
Filesystem blocks quota limit grace files quota limit grace
nccsfiler3.ccs.ornl.gov:/vol/home1
nccsfiler4.ccs.ornl.gov:/vol/home2

For Lustre directories, use lustredu:

$ lustredu $MEMBERWORK/stf007
Last Collected Date Size File Count Directory
... MB 9 /lustre/atlas1/stf007/scratch/user1
36 Common Questions Where do I find user guides and other documentation? The OLCF website, CrayDoc, and NVIDIA-hosted documentation.
37 Data Storage Practices HPSS is the proper location for long-term storage. Project areas offer a common area for files (executables, data, etc.) shared among project members, but should not be considered long-term storage: you need to keep an eye on disk usage; they should still be backed up; and they may be purged (the purge policy is described on the data management policy page mentioned earlier).
38 Lustre/Scratch Filesystem Remember that the current version of Lustre has a single metadata server for each filesystem. Codes may need revisions to prevent opening large numbers of files simultaneously; codes using .so files can do this, so that should be avoided if possible. Compiling in Lustre can be slow; when possible, compile in NFS. There is a purge, but deleting files yourself when they're no longer needed helps.
39 Dealing with the Scratch Purge: Conditional Transfers Many codes use files from previous iterations of the code, and those files may get deleted by the scratch purge. This can present challenges: pulling from HPSS every time is inefficient; multiple scripts ("data in place", "data not in place") are cumbersome; touch isn't always ideal. Conditional transfers help with this:

#!/bin/bash
...
# Fetch the file from HPSS only if the purge has removed it
if [[ ! -a $MEMBERWORK/stf007/some_important_file ]]; then
    hsi -q get /home/user1/data/some_important_file
fi

aprun -n 4096 ./a.out
...
40 Interacting with HPSS HPSS is a somewhat complex system. HPSS prefers a small number of large files, not a large number of small files; htar is your friend in this regard. htar is (much) faster than a tar followed by hsi put. Limited disk space is no problem: data is streamed directly to HPSS, so there is no intermediate local storage. Running many transfers at a time can be problematic; multiple transfers may not give you parallelism, and limiting the number of per-user transfers helps the system operate more efficiently (and therefore can be more efficient for you). Usage examples are on the OLCF web site; a minimal sketch follows.
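A minimal htar sketch; the archive and directory names are illustrative:

$ htar -cvf run42.tar ./run42            # bundle a directory straight into HPSS (no local tar file)
$ hsi ls run42.tar                       # the archive (and its .idx index) now live in HPSS
$ htar -xvf run42.tar run42/restart.d    # later, extract only the members you need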
41 Running Jobs at OLCF Batch job information is available on the OLCF web site. Since we are designated as a leadership-class facility, queuing policy heavily favors large jobs. Remember that jobs are charged based on what's made unavailable to others, not what you use: the system cannot allocate a single node to multiple aprun instances (much less multiple users). Requests for high priority/quick turnaround are considered. Submit your job ASAP; it may start more quickly than expected. Allow plenty of lead time when making a request; discussion may be necessary prior to a decision on approval.
42 Running Jobs at OLCF From a user perspective, Titan has three major parts: the system proper, the external login nodes, and the MOAB server. Often, only the system proper is affected by outages; the external login nodes and the MOAB server node remain up. This means you can compile, submit jobs, etc. while Titan is down; jobs will be queued and will run when the system returns.
43 Debugging/Optimization at OLCF Will be discussed in more depth in a few minutes, but several software tools are provided for debugging and optimizing your applications: DDT, Vampir, CrayPAT. Information on these tools is available on the web; you can also contact the OLCF User Assistance Center if you have questions.
44 Support Best Practices Send as many error messages as possible, or place them all in a file and direct us to it. When sending code, create a .tar file and direct us to it; include all files necessary to build/run the code (more efficient than sending through email). When possible, reduce the error to a small reproducer code; we can assist with this, and if the error has to go to the vendor, they'll want this. Send new issues in new tickets, not replies to old ones.
45 OLCF User Survey Once a year (usually the final quarter of the calendar year) we conduct a user survey: a mixture of 1-5 ratings and open-ended questions. We greatly value your responses; open-ended responses are responsible for several center initiatives. We can't fix a problem if we don't know it's there!
46 Finally We're here to help you. Questions/comments/etc. can be sent to the OLCF User Assistance Center, 9AM-5PM Eastern, Monday-Friday exclusive of ORNL holidays. Phone: (865)
