iCER Bioinformatics Support Fall 2011


iCER Bioinformatics Support, Fall 2011. John B. Johnston, HPC Programmer, Institute for Cyber Enabled Research. © 2011 Michigan State University Board of Trustees.

Institute for Cyber Enabled Research (iCER): Hardware (HPCC), Software and Support, Education, Consulting, Collaboration

iCER: What is it? The Institute for Cyber Enabled Research (iCER) at Michigan State University (MSU) was established to coordinate and support multidisciplinary resources for computation and the computational sciences. The institute's goal is to enhance MSU's national and international presence, and competitive edge, in disciplines and research thrusts that rely on advanced computing.

HPCC: What is it? The High Performance Computing Center (HPCC) provides computational hardware and support to MSU faculty, students, and researchers. The HPCC is contained within iCER, effectively serving as the hardware, systems, and software arm of iCER's research support mission.

Bioinformatics Outreach: HPCC hardware; software resources; Help Desk; seminars; one-on-one consulting; limited on-site systems setup and configuration; programming and scripting assistance. FREE! wiki.hpcc.msu.edu/display/bioinfo/bioinformatics+support+at+msu

HPCC Cluster Overview. Linux operating system; the primary interface is text based, through Secure Shell (SSH). All machines in the main cluster are binary compatible (compile once, run anywhere). Each user has 50 GB of personal hard drive space (/mnt/home/username/). Users have access to 33 TB of scratch space (/mnt/scratch/username/). A scheduler manages jobs running on the cluster; a submission script tells the scheduler the resources required and how to run a job. A module system manages the loading and unloading of software configurations.

Gateway to the System. Access to the HPCC is primarily through the gateway machine: ssh username@hpc.msu.edu or ssh username@gateway.hpcc.msu.edu. Access to all HPCC services uses your MSU username and password. Once in, you can go to the user-oriented destination of your choice.
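A first session from your own workstation might look like the following sketch. The hostnames and the home-directory path come from the slides; the file name is a placeholder, and these are interactive commands rather than a script:

```shell
# Log in to the gateway machine with your MSU credentials:
ssh username@hpc.msu.edu

# From the gateway, hop to a developer node for interactive testing:
ssh dev-intel10

# From your workstation, copy input files to your HPCC home directory:
scp mydata.fasta username@hpc.msu.edu:/mnt/home/username/
```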

HPCC System Diagram

Why the HPCC Cluster? Large data sets; lots of number crunching; a need to run many simultaneous jobs with different data sets and/or configuration settings; software you don't have, or don't want to / can't set up; a comprehensive, ready-made development environment that is actively administered.

Linux? OH NOES! If you are a Linux pro, go ahead and take a short nap (you've got ~60 seconds). If you're not, don't worry! That's why I get the (not so) big bucks. The Bioinformatics Help Desk is here to get you up and running.

Linux Support: client application selection; bring in your laptop (if you have one); cookbook tutorials and cheat sheets (more on the way); one-on-one consultation; limited on-site support and training. We also provide Samba support for Windows and Mac boxes, so you can map your HPCC account directory to your workstation.

HPCC Online Resources: www.hpcc.msu.edu (HPCC home); wiki.hpcc.msu.edu (public/private wiki); forums.hpcc.msu.edu (user forums); rt.hpcc.msu.edu (help desk request tracking); mon.hpcc.msu.edu (system monitors).

Available Software. Center-supported development software: Intel compilers, OpenMP, OpenMPI, MVAPICH, TotalView, MKL, PathScale, GNU... Center-supported research software: MATLAB, R, AMBER, BLAST, CHARMM, EMBOSS... Customer software (module use.cus): ClustalW, QuEST, MEME, Velvet, mpiBLAST, Bowtie, AMOS, ABySS, MUMmer, HMMER, PHYLIP, SAMtools. For a more up-to-date list, see the documentation wiki: http://wiki.hpcc.msu.edu/ Don't see it here? Ask for it; we'll try to help.
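The module system behaves like the standard Environment Modules tool; a typical interactive session might look like the sketch below. The package name is illustrative, and `module use.cus` is quoted from the slide, so check the wiki for the exact local syntax:

```shell
module avail          # list every software configuration the center provides
module use.cus        # expose the customer-software tree (syntax per the slide)
module load BLAST     # add a package to your environment
module list           # show what is currently loaded
module unload BLAST   # remove it again when finished
```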

User Software. 50 GB of initial user space is provided; install your own software in user space. The HPCC offers a rich build environment. Quota increases can be made available. Code installation and (modest) modification support is available through moi.

Virtual Machines. Virtual servers implemented in software, available for research labs/working groups. Flavors currently available: Galaxy, BLAST (web-browser based), UCSC Genome Browser, more on the way...

Database Offerings. db-01: internal MySQL database node attached to the cluster. Hosts user datasets of modest size; BLAST database repository; VM-based UCSC, for example. Up to 1 TB of total user space for free; $250/yr per TB thereafter.
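Connecting to db-01 from a cluster machine should work like any MySQL client session; the hostname comes from the slide, while the database name below is hypothetical:

```shell
# Open an interactive session against the internal database node
# ("db-01" per the slide; "my_reads" is a made-up database name):
mysql -h db-01 -u "$USER" -p my_reads
```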

Multiprocessor Apps. Many bioinformatics applications are beginning to appear in multiprocessor-capable versions. The workload can be divided so that each processor completes part of the job in parallel, decreasing run time. The HPCC provides access to a large number of processing cores, plus memory and disk space.

Some Examples: multithreaded BLAST (shared memory); mpiBLAST (distributed memory); Velvet assembler (multithreaded, shared memory); MAKER2 (MPI, distributed memory). OpenMP, OpenMPI, MVAPICH.
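As a rough sketch of how the two memory models differ at the command line, assuming the legacy NCBI `blastall` and a generic MPI launcher (flags, database, and file names are assumptions; consult the wiki for the site-specific invocation):

```shell
# Shared memory: one node, several threads (-a sets the thread count):
blastall -p blastx -d nr -i query.fa -o query_smp.out -a 8

# Distributed memory: mpiBLAST ranks spread across nodes by MPI:
mpirun -np 16 mpiblast -p blastx -d nr -i query.fa -o query_mpi.out
```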

Cluster Developer Nodes. Developer nodes are accessible from the gateway and are used for testing: ssh dev-amd05 (same hardware as amd05); ssh dev-intel07 (same hardware as intel07); ssh dev-intel10 (same hardware as intel10); ssh dev-amd09 (same hardware as amd09); ssh dev-gfx10 (same hardware as gfx10). We periodically have some test boxes: ssh dev-gfx08 (NVIDIA graphics processing node); ssh dev-cell08 (PlayStation 3 Cell processor); ssh dev-intel09 (8-core Intel Xeon with 24 GB of memory). Jobs running on the developer nodes should be limited to two hours of walltime; developer nodes are shared by everyone.

HPCC System Diagram

Steps in Using the HPCC: connect to the HPCC; determine required software; transfer required input files and source code; compile programs (if needed); test software/programs on a developer node; write a submission script; submit the job; get your results and write a paper!!
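The submission-script step can be sketched as follows; the job name, resource values, and module name are placeholders, not a site-mandated template:

```shell
# Write a minimal PBS submission script (all values are illustrative):
cat > myjob.qsub <<'EOF'
#!/bin/bash
#PBS -N my_blast_job
#PBS -l nodes=1:ppn=4,walltime=02:00:00,mem=8gb
cd "$PBS_O_WORKDIR"        # run from the directory the job was submitted in
module load BLAST          # load required software (name is illustrative)
echo "job ran on $(hostname)"
EOF

# Submit it to the scheduler (run on the HPCC, not on your workstation):
#   qsub myjob.qsub
```

The scheduler reads the `#PBS` directives to learn what resources the job needs; everything after them runs as an ordinary shell script on the allocated node.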

A couple of examples: a biological model (long running, with many similar but not identical runs); multiprocessor BLAST searches; multiprocessor Velvet assembly. In each case, using the HPCC cluster produced more results in less time, with little or no active user management.

But I don't need a cluster! We can still help with: tool selection and setup; scripting assistance; data browsing, sharing, and group analysis; lab help or training.

Scripting. Customized or standardized scripts, or modifications to existing ones, in Python, Perl, or your language of choice. We have a growing collection available as a Git repository. Perhaps you don't know anything about scripting; or maybe you do, but could use some help?

Tutorials: Titus Brown's ANGUS-NGS tutorials, converted to use examples on the HPCC instead of Amazon; using UCSC for certain tasks; mpiBLAST; Velvet and Oases; others being developed...

Seminars and Education. NextGen Bioinformatics Seminars: wiki.hpcc.msu.edu/display/bioinfo/nextgen+bioinformatics+seminars. HPCC Mid-Morning Break: wiki.hpcc.msu.edu/display/announce/hpcc+mid-morning+break+series

Setting up an account. All account requests must come via a PI. Have your PI fill in the form at www.hpcc.msu.edu/request. Once received, we will process your request and notify you when your account is ready.

Bioinformatics Contact. John Johnston, HPC Programmer. M-W: 1449 BPS, 884-2572. Th-F: 505 BMB, 432-7177. johnj@msu.edu. Ticket requests: https://rt.hpcc.msu.edu/index.html. Please include "Bioinformatics Help" in the subject so your request is routed more quickly.