Distributed Computing under Linux




Lars Lindbom and E. Niclas Jonsson
Division of Pharmacokinetics and Drug Therapy
Department of Pharmaceutical Biosciences
Uppsala University

Overview
- What is Distributed Computing?
- Enabling distributed computation
- What are the potential benefits of DC in population PK/PD?
- Example: the bootstrap
- Example of a small cluster
- Conclusions

What is Distributed Computing?
- It is using two workstations to solve one task and gaining 100% in speed.
- It is using 1000 processors (CPUs) to solve a task that would otherwise have been (practically) impossible to solve.
- It is using available resources more efficiently.

Enabling distributed computation
[Diagram labels: Hardware; Network; Personal Computers; Operating System; Fileserver, Tape backup; Methods (Software); Computing Nodes; Cluster]

Enabling distributed computation
There is more to distributed computing than connecting computers to a common network:
- Hardware
- Operating System
- Methods (Software)
  - Making the computing nodes talk to each other and cooperate
  - Identifying the parts of a scientific problem that can be parallelized

What are the potential benefits of DC in population PK/PD?
- We rely on computers to fit models to data. Preferably, the software for this task should be written so that it can use multiple processors.
- No software supports this today.
- There are other tasks within a population analysis that involve multiple model fits, where one fit does not necessarily depend on a previous one:
  - Model building
  - Model validation

Model building and model validation
- Automated Stepwise Covariate Model Building
- Model Selection using Genetic Algorithms
- Jackknife
- Log-Likelihood Profiling
- Cross Model Validation
- Bootstrap
- Case Deletion Diagnostics
- Cross Validation
- Posterior Predictive Check
J.S. Urban Hjorth, Computer Intensive Statistical Methods (Chapman & Hall, New York, 1994)

Computer intensive methods
- Often based on repetition of nearly identical tasks that can be performed independently of each other
- Often embarrassingly parallel
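To make "embarrassingly parallel" concrete, here is a minimal Python sketch, not the authors' implementation. The function `fit_one` is a hypothetical stand-in for one independent model fit, and the worker count is an arbitrary assumption. Because no task needs the result of another, the tasks can be handed to a pool of workers in any order.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_one(task_id):
    """Hypothetical stand-in for one independent model fit."""
    # Each task uses only its own input, so tasks can run in any order.
    return task_id, sum(i * i for i in range(1000 + task_id))


def run_all(task_ids, workers=4):
    """Run independent tasks concurrently and collect results by task id."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(fit_one, task_ids))


results = run_all(range(8))
```

A thread pool keeps the sketch simple. For CPU-bound fits, each task would more likely be a separate operating system process, which is also the unit that a cluster can place on different nodes.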

Example: the bootstrap
- Assume that we have a model fit to a data set.
- Assume further that we suspect that the true confidence interval around our estimate of clearance is non-symmetric.

Bootstrap procedure:
- Draw 2000 new data sets with replacement from our data set.
- Refit the model for the 2000 bootstrap samples.
- Compute the confidence intervals for clearance from the distribution of the bootstrap estimates.
B. Efron and R. J. Tibshirani, An Introduction to the Bootstrap (Chapman & Hall, New York, 1993)
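The three steps above can be sketched in Python as a percentile bootstrap. This is an illustration under stated assumptions, not the procedure used in the talk: the "model fit" is replaced by a simple estimator (the mean), and the clearance values are made up for the example. In a real analysis each refit would be a full NONMEM run on a data set resampled by individual.

```python
import random
import statistics


def bootstrap_ci(data, estimator, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap confidence interval for estimator(data)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        # Step 1: draw a new data set of the same size, with replacement.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        # Step 2: "refit" on the bootstrap sample (here: a simple estimator).
        estimates.append(estimator(sample))
    # Step 3: read the interval off the sorted bootstrap estimates.
    estimates.sort()
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_boot)]
    upper = estimates[min(n_boot - 1, int((1.0 - alpha) * n_boot))]
    return lower, upper


# Hypothetical, skewed values standing in for clearance estimates.
clearance = [2.1, 2.4, 2.6, 2.9, 3.0, 3.3, 3.8, 4.5, 6.2, 9.7]
lower, upper = bootstrap_ci(clearance, statistics.mean)
```

Because the interval is taken from the empirical distribution of the estimates, it is not forced to be symmetric around the original estimate. The 2000 refits are independent of each other, which is what makes the method suitable for a cluster.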

[Figure: distribution of the bootstrap estimates of clearance (CL), with the original estimate and the 95% confidence interval marked]

Example of a small cluster, prerequisites
The cluster should be
- Easy to set up
- Cheap (low total cost for hardware, software and administration)
- Independent of operating system
- Independent of processor architecture
- As far as possible independent of third party software

Example of a small cluster
- Network: 100 Mb/s Ethernet
- File server
- Backup server
- Computing nodes
  - Pool of personal desktops and workstations: 5 CPUs
  - Dedicated servers and workstations: 10 CPUs

Cluster-enabled operating system: Linux and openMosix
- Open source operating system: Linux (Red Hat Linux 7.3)
- Open source Single System Image clustering add-on to Linux: openMosix for kernel 2.4.19

[Diagram: networked computing nodes. Labels: front end node; 2x1 GHz; 1x2.6 GHz; 1x700 MHz; long NONMEM run; 1 h NONMEM run; bootstrap, 5 parallel runs]

110 processes: 107 sleeping, 3 running, 0 zombie, 0 stopped
CPU states: 1145.0% user, 5.8% system, 944.2% nice, 0.0% idle
Mem: 514128K av, 453064K used, 61064K free, 0K shrd, 2808K buff
Swap: 1052216K av, 324K used, 1051892K free 77664K cached

PID   USER   PRI  NI  SIZE   RSS  SHARE  STAT  N#  %CPU  %MEM   TIME  COMMAND
1639  grant   11   0  1236  1236      4  S     13  99.9   0.2  77:00  nonmem
2034  anja    19  19  1788  1788    488  S N   13  99.9   0.3   2:02  nonmem
2035  anja    19  19  1800  1800    500  S N    3  99.9   0.3   2:05  nonmem
2036  anja    19  19  1796  1796    496  S N   17  99.9   0.3   2:06  nonmem
2043  anja    19  19  1796  1796    492  S N    5  99.9   0.3   2:09  nonmem
2045  anja    19  19  1792  1792    496  S N    4  99.9   0.3   1:58  nonmem
2046  anja    19  19  1796  1796    496  R N    0  99.9   0.3   1:38  nonmem
2047  anja    19  19  1788  1788    492  S N    5  99.9   0.3   2:04  nonmem
2050  anja    20  19  1788  1788    492  S N    4  98.8   0.3   2:07  nonmem
1715  grant   17   0  1772  1772    460  S     19  98.6   0.3  59:51  nonmem
2049  anja    19  19  1796  1796    496  S N   18  50.3   0.3   1:29  nonmem
2044  anja    19  19  1808  1808    504  S N   18  50.2   0.3   1:50  nonmem
2048  anja    19  19  1616  1616    320  R N    0  45.6   0.3   0:53  nonmem
2051  root     9   0  2108  2108   1740  S      0   0.3   0.4   0:00  sshd
2116  lasse    9   0  1040  1040    824  R      0   0.3   0.2   0:00  mtop
2054  lasse    9   0  4212  4212   3396  S      0   0.1   0.8   0:00  gnome-terminal
1     root     8   0   476   476    420  S      0   0.0   0.0   0:08  init
2     root     8   0     0     0      0  SW     0   0.0   0.0   0:00  keventd
3     root    19  19     0     0      0  SWN    0   0.0   0.0   0:00  ksoftirqd_cpu0
4     root    19  19     0     0      0  SWN    0   0.0   0.0   0:00  ksoftirqd_cpu1

Benefits: single NONMEM jobs on an openMosix cluster
- About 40 users during the past three years
- On any given day, 5-7 people are running jobs on the cluster
- Better usage of a heterogeneous computer pool
- The longest runs get the fastest CPUs
- Easier administration of one front end node than of many computational servers

Perl-speaks-NONMEM (PsN)
Computer intensive methods using NONMEM share many common tasks:
- Opening, reading, changing, writing to and saving NONMEM files (model files, data files and output files)
- Running NONMEM in a controlled fashion, registering the termination and analyzing the result
PsN is intended to provide a common programming library for method development using NONMEM.
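The idea of running a job "in a controlled fashion" and registering how it terminated can be illustrated with a short Python sketch. It is not PsN code (PsN is written in Perl), and the function name `run_controlled` and its return format are assumptions made for this example. A tiny Python child process stands in for a real NONMEM invocation.

```python
import subprocess
import sys


def run_controlled(command, timeout_s=60):
    """Run one external job, wait for it, and record how it ended."""
    try:
        done = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # The job did not finish in time; report that instead of hanging.
        return {"status": "timeout", "returncode": None, "output": ""}
    status = "ok" if done.returncode == 0 else "failed"
    return {"status": status, "returncode": done.returncode, "output": done.stdout}


# Stand-in command: a small Python process instead of a real NONMEM run.
result = run_controlled([sys.executable, "-c", "print('run finished')"])
failed = run_controlled([sys.executable, "-c", "import sys; sys.exit(3)"])
```

Wrapping every run in one function like this gives a single place to add logging, retries, or parsing of the output file, which is the kind of shared task a common library is meant to cover.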

Perl-speaks-NONMEM
Perl
- Handles text files effectively
- Has good support for invoking system calls within scripts
- Supports parallel execution
- Is platform independent

Perl-speaks-NONMEM
- PsN is object oriented.
- Object classes have been created using the NONMEM files as basis: model, data and output classes.
- The classes include methods for many tasks, e.g.
  - Extracting parameter estimates or termination status
  - Splitting or resampling of data
  - Changing initial estimates, etc.
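As an illustration of what a data class with splitting and resampling methods might look like, here is a small Python sketch. It is purely hypothetical: the class name `Dataset`, its fields, and its method signatures are invented for this example and do not reflect the actual PsN (Perl) API.

```python
import random
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical data class: observation records grouped by individual."""

    individuals: dict  # individual id -> list of observation records

    def resample(self, seed=None):
        """Draw individuals with replacement; return a new Dataset with fresh ids."""
        rng = random.Random(seed)
        ids = list(self.individuals)
        drawn = [rng.choice(ids) for _ in ids]
        return Dataset(
            {new_id: list(self.individuals[old]) for new_id, old in enumerate(drawn, start=1)}
        )

    def split(self, n_bins):
        """Return n_bins Datasets, each omitting one bin of individuals."""
        ids = list(self.individuals)
        bins = [set(ids[i::n_bins]) for i in range(n_bins)]
        return [
            Dataset({i: recs for i, recs in self.individuals.items() if i not in omitted})
            for omitted in bins
        ]


data = Dataset({1: [0.5, 0.3], 2: [0.7, 0.4], 3: [0.6, 0.2], 4: [0.9, 0.5]})
boot = data.resample(seed=0)
parts = data.split(2)
```

Here `resample` corresponds to the data handling needed for a bootstrap and `split` to the data handling needed for case deletion diagnostics or cross validation. Putting both on one class means each method built on top only has to decide which data sets to fit.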

PsN based methods
Examples of methods developed using PsN
- Automated covariate model building
- Bootstrap
- Log-likelihood profiling
- Case deletion diagnostics
The current version of PsN is 2.0 and it can be obtained for free.

Own experiences and thoughts
- The demand for system administration of a small cluster is 1/3-1/2 of one full-time employee.
- Very short runs (<10 sec) do not benefit from a distributed environment: the overhead of data transfer between nodes is too high.
- Other solutions for distributed computing exist; this is an expanding area.
- openMosix scales well in this application to at least 20 CPUs.

Conclusions
- Affordable, fairly simple solutions for distributed computing within the population PK/PD area exist.
- A distributed computing setup is (presently) a prerequisite for the use of computer intensive methods in population PK/PD.

References
- openMosix homepage, http://openmosix.sourceforge.net
- Red Hat Linux homepage, http://www.redhat.com
- PsN, Lars.Lindbom@farmbio.uu.se