Redbooks Paper. Life Sciences Applications on IBM POWER5 and AIX 5L Version 5.3: Virtualization Exploitation through Micro-Partitioning Implementation




Carlos P. Sosa, Balaji V. Atyam, Peter Heyrman, Naresh Nayar, Jeri Hilsabeck

Abstract

In this study, we present a series of benchmarks that exploit the virtualization features of IBM POWER5 technology and IBM AIX 5L Version 5.3. We define virtual benchmarks (VBs) based on new functionality introduced in the IBM eServer iSeries and pSeries POWER5 technology-based systems and AIX 5L Version 5.3. The selected benchmarks rely on virtualization through Micro-Partitioning. The applications tested in this study are Gaussian 03 Rev. C.01, BLAST 2.2.6, and AMBER 7. We show that throughput benchmarks running on a system with Micro-Partitioning can take full advantage of a pool of shared processors. In other words, virtual processors improve the time to solution.

Copyright IBM Corp. 2005. All rights reserved.

Introduction

The idea of virtualization is currently being exploited in many areas by different groups at IBM. In storage solutions, virtualization is seen as a way to help reduce the complexity and cost of managing SAN-based storage. With the IBM TotalStorage Virtualization family, you can manage your storage infrastructure from a single point of control with centralized volume, file, and device management. Together, these products can help you drive down the cost and complexity of managing your storage infrastructure, while providing the flexibility to address rapidly changing storage needs.

Another example is the Virtual Loan Program (VLP). Here, the idea is to provide ISVs with access to pSeries systems on an on demand basis. This eliminates the costly proposition of providing each ISV with its own computer system, which also delays the availability of applications on a given release.

The IBM Virtualization Engine is one of the most recent efforts to exploit virtualization technology at multiple levels. It fits well with the on demand business model. The Virtualization Engine is composed of services and IBM technologies. The key idea is for resources available on IBM servers to function as a single pool that can be more easily managed across the organization [1].

In this study, we look at the part of the Virtualization Engine that corresponds to logical partitions (LPARs) and Micro-Partitioning [1]. Partitioning capabilities have been improved on POWER5 technology-based systems to provide sub-processor partitioning [2,3]. On pSeries POWER4 technology-based systems, partitions were constrained to physical processor boundaries. This limitation has now been removed, and fractions of physical processors can be shared as part of a pool of resources [1,2]. In particular, we explore the usability of this new technology to improve performance on a series of throughput benchmarks.
These throughput benchmarks, of course, require the system to be partitioned through Micro-Partitioning. The benchmarks were carried out with three of the most popular applications in Life Sciences, and they reflect actual benchmarks requested by customers.

Micro-Partitioning

The benchmarks carried out here make use of many of the new virtualization features of the IBM POWER5 technology-based systems, namely, Micro-Partitioning technology [1]. Micro-Partitioning enables multiple LPARs to run on a physical processor in a time-sliced fashion (see Figure 1). The POWER Hypervisor manages the time-slicing of LPARs according to Hardware Management Console (HMC)-defined parameters [1].

Micro-Partitions are assigned CPU entitlements at a granularity of 1/100 of a CPU (with a minimum of 1/10 CPU per LPAR). The CPU entitlement is defined as part of a Micro-Partition's profile definition, but can be changed dynamically. Micro-Partitions run with virtual processors. The number of virtual processors is defined as part of a Micro-Partition's image definition and can also be changed dynamically. Virtual processors are scheduled onto the physical processors of the shared physical processor pool. On POWER5 technology-based systems, there is a single shared pool that provides the physical processors for all Micro-Partitions. The shared pool can be resized dynamically by adding or removing physical processors. A virtual processor can be dispatched on any physical processor in the pool. The Hypervisor attempts to maintain virtual-to-physical processor affinity when it dispatches virtual processors.

Figure 1. An example of a system configured with Micro-Partitioning

A Micro-Partition can be capped or uncapped [1,2]. A capped Micro-Partition cannot exceed its entitled CPU capacity. Conversely, an uncapped Micro-Partition can use CPU resources beyond its entitlement, as long as there are excess cycles in the pool. When a virtual processor in a shared Micro-Partition reaches its idle loop, it gives up the remaining cycles in its entitled capacity to the Hypervisor so that the cycles can be used by other Micro-Partitions. It is important to note that dedicated processor LPARs continue to be supported; here, CPU resources are dedicated and not shared between LPARs. LPAR isolation is maintained for Micro-Partitions.

Resources

In this section, we describe the resources we used for this study.

Hardware

To carry out this study, we used one of the recently announced IBM eServer pSeries POWER5 servers. The p5 Model 570 server used here had 16 processors running at 1.9 GHz. The system memory consisted of 512 GB (DDR1). The POWER5 processor supports the 64-bit PowerPC architecture. Each chip contains two identical processor cores, and each core supports two hardware threads through simultaneous multi-threading (SMT). With SMT, the chip appears as a 4-way processor to the operating system. In this study, we did not use the SMT feature. The two cores share a 1.92 MB L2 cache. The directory for the off-chip 36 MB L3 cache is on-chip, and the memory controller is also integrated on-chip.

On POWER5 technology-based systems, the logical partitioning of the machine is substantially different from POWER4 technology-based systems. Physical processors are abstracted into virtual processors. Provided that the system has been configured that way, the physical processors can be shared by multiple logical partitions.

Scientific applications

We selected three Life Sciences applications that rely on different computational methods to carry out molecular simulations. These applications are in the areas of quantum chemistry, molecular mechanics and molecular dynamics, and bioinformatics.

Gaussian [4] is a connected series of programs that can be used for performing a variety of electronic structure calculations: molecular mechanics, semi-empirical, ab initio, and density functional theory. Gaussian consists of a collection of programs commonly known as links. The links communicate through disk files and are grouped into overlays [5].
Links are independent executables located in the g03 directory and labeled lxxx.exe, where xxx is the unique number of each link. In general, overlay zero is responsible for starting the program, which includes reading the input file. After the input file is read, the route card (keywords and options that specify all the Gaussian parameters) is translated into a sequence of links. Overlay 99 (l9999.exe) terminates the run; in most cases, l9999.exe finishes with an archive entry (a brief summary of the calculation).

The theoretical methods chosen in this study have been extensively discussed in the literature [6], and it is beyond the scope of this work to describe them. The approximations used in this work correspond to Hartree-Fock (HF) [6]. The case used in this study is one of the cases from previous studies [7-9]: the molecule α-pinene at the HF level of theory with the 6-311G(df,p) basis set. We selected this case because the I/O for this particular calculation is minimal. The I/O capabilities will be tested in a future study [10].

AMBER (Assisted Model Building with Energy Refinement) is a flexible suite of programs for performing molecular mechanics and molecular dynamics calculations based on force fields [11]. Sander is the primary program used for molecular dynamics simulations and is the only program considered in our current study. Sander carries out energy minimization, molecular dynamics, and NMR refinements. AMBER is a floating-point-intensive FORTRAN code. The version used in this study is AMBER 7 for IBM systems [12].

The test that we selected for AMBER is the JAC benchmark, a joint AMBER-CHARMM benchmark. It considers the protein dihydrofolate reductase (DHFR) in an explicit water bath with cubic periodic boundary conditions. The system size and simulation conditions are: 23,558 atoms; cubic periodic box with a dimension of 62.23 Å; 9 Å nonbond cutoff with a 2 Å buffer (that is, a list with an 11 Å cutoff); 1 fs time step; 1000 steps; NVE ensemble (constant energy, constant volume); and bonds to hydrogen constrained (SHAKE). The particle mesh Ewald (PME) method was used for calculating the Lennard-Jones (LJ) and electrostatic interactions with a 64x64x64 grid; the equilibration temperature was 300 K.

BLAST (Basic Local Alignment Search Tool) is a set of similarity search programs designed to explore all of the available sequence databases regardless of whether the query is protein or nucleic acid [13]. The BLAST programs have been designed for speed, with a minimal sacrifice of sensitivity to distant sequence relationships. The scores assigned in a BLAST search have a well-defined statistical interpretation, making real matches easier to distinguish from random background hits. BLAST uses a heuristic algorithm that seeks local, as opposed to global, alignments and is therefore able to detect relationships among sequences that share only isolated regions of similarity [13].

Life Sciences virtual benchmarks

For this set of Life Sciences applications, these benchmarks illustrate that Micro-Partitioning allows for increased overall utilization of system resources by automatically making use of additional processors that are part of a shared pool. The processors in the shared processor pool are not associated with dedicated partitions.

The partition profiles that we selected for this study are summarized in Table 1. Case I simulates two separate machines, each with eight processors. In this particular case, by definition, there is no shared processor pool. Case II corresponds to a partition (LPAR1) that has been configured with four virtual processors, while LPAR2 runs in dedicated mode. This case should provide information about differences due to virtual processors. Case III is similar to case II; the main difference is that LPAR2 was shut down. Cases IV through VI test the benefit of virtual processors on a throughput benchmark. Case IV had four virtual processors, and here LPAR2 is defined as shared. Case IV is characterized by the fact that no jobs were submitted on LPAR2 while the throughput benchmark ran to completion on LPAR1. Cases V and VI are similar to case IV, except that 4 and 8 jobs, respectively, were submitted to LPAR2.

Table 1. Profiles used for LPAR1 and LPAR2 for cases I-VI

Parameters                    Case I  Case II  Case III  Case IV  Case V  Case VI
LPAR1 profile
  Desired processor capacity  N/A     8        8         8        8       8
  Desired virtual processors  8       12       12        12       12      12
  Capped                      N/A     No       No        No       No      No
  Variable capacity weight    N/A     128      128       128      128     128
LPAR2 profile
  Desired processor capacity  N/A     N/A      N/A       8        8       8
  Desired virtual processors  8       8        8         8        8       8
  Capped                      N/A     N/A      N/A       No       No      No
  Variable capacity weight    N/A     N/A      N/A       128      128     128

Partitions mode:
  Case I:   LPAR1 dedicated and LPAR2 dedicated.
  Case II:  LPAR2 up and running in dedicated mode.
  Case III: LPAR2 was shut down.
  Case IV:  LPAR2 was shared, up, and idle (empty).
  Case V:   LPAR2 was shared, up, with 4 jobs running.
  Case VI:  LPAR2 was shared, up, with 8 jobs running.
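
The entitlement rules from the Micro-Partitioning section, combined with a profile like those in Table 1, can be sketched in a few lines of Python. This is only an illustrative model, not Hypervisor or HMC code: the class, field, and function names are hypothetical, the rule that a partition cannot use more physical processors than it has virtual processors is an assumption not stated in this paper, and time-slicing and the variable capacity weight are ignored.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    """Hypothetical record of a Micro-Partition profile (names are illustrative)."""
    name: str
    entitlement: float        # entitled capacity, in physical processors
    virtual_processors: int
    capped: bool

    def validate(self) -> None:
        hundredths = round(self.entitlement * 100)
        # From the text: entitlements have a granularity of 1/100 of a CPU ...
        if abs(self.entitlement * 100 - hundredths) > 1e-9:
            raise ValueError("entitlement must be a multiple of 1/100 CPU")
        # ... with a minimum of 1/10 CPU per LPAR.
        if hundredths < 10:
            raise ValueError("entitlement must be at least 1/10 CPU")
        # Assumption: one virtual processor uses at most one physical processor.
        if self.virtual_processors < self.entitlement:
            raise ValueError("too few virtual processors for this entitlement")


def usable_capacity(partition, demand, idle_pool):
    """Physical processors a partition can consume under this simplified model.

    demand    -- processors the workload could keep busy (for example, job count)
    idle_pool -- excess processors currently unused in the shared pool
    """
    if partition.capped:
        # A capped partition cannot exceed its entitled capacity.
        return min(demand, partition.entitlement)
    # An uncapped partition may also use excess cycles from the pool.
    return min(demand,
               partition.entitlement + idle_pool,
               partition.virtual_processors)


if __name__ == "__main__":
    # LPAR1 as read from Table 1 (cases II-VI): capacity 8, 12 virtual processors.
    lpar1 = MicroPartition("LPAR1", 8.0, 12, capped=False)
    lpar1.validate()
    for idle in (8, 0):
        print(f"idle pool {idle}: usable {usable_capacity(lpar1, 16, idle)}")
```

Under these assumptions, an uncapped LPAR1 with 16 runnable jobs can grow beyond its entitlement only while the pool has idle processors, which is the situation that cases III through V are designed to probe.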

Table 2 summarizes the elapsed time for our Gaussian 03 example as a function of the previously defined cases. Column 1 gives the number of jobs that were submitted at the same time. Once submitted, only these jobs ran on the corresponding partition. All the times for two or more jobs are compared to a single job running on a dedicated system. The table clearly shows that the elapsed time increases as we increase the number of jobs running on the machine. Of course, the elapsed time depends on the particular case. The times reported in Table 2 are averages computed over the number of jobs in each throughput run.

Table 2. Average elapsed times, in seconds, for the Gaussian 03 application

Number of jobs  Case I  Case II  Case III  Case IV  Case V  Case VI
1               516     499      504       501      498     511
2               516     497      499       506      500     508
4               517     505      502       503      504     510
8               539     530      503       503      514     542
10              657     660      509       508      520     745
16              1053    1076     663       659      680     1085

Table 3 summarizes the percentage difference of each case compared to case I. In other words, we want to know the effect of having a pool of shared virtual processors. We compute the percentage difference (%) of all the other cases relative to case I. A positive number represents a slowdown and a negative number represents a speedup. We chose case I as the baseline because it was defined without a pool of shared virtual processors. In the case of Gaussian 03, case I simulates a stand-alone system with eight processors. The benchmark is a throughput benchmark with up to 2N jobs, where N is the number of processors. In other words, we oversubscribe the machine by at most a factor of two. Case II shows that defining virtual processors introduces a slight slowdown when running 10 and 16 jobs. This is not surprising, because it has been reported previously [2].
However, these runs enable us to quantify it: for this case, the slowdown is of the order of 2%, and only in the most extreme cases. The difference between case II and case III is that in case III LPAR1 had access to all the resources on the machine, because LPAR2 was shut down. This example clearly illustrates the effect of the virtual processors when they are available. Similarly, in case IV LPAR2 was idle; thus, all the resources could be made available to LPAR1 when needed.
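
The percentage difference relative to case I described above can be reproduced directly from the Table 2 averages. The following is a minimal sketch; the function and variable names are illustrative, and rounding to the nearest integer is an assumption about how the published percentages were prepared.

```python
# Average elapsed times in seconds for Gaussian 03, copied from Table 2.
# Each row lists cases I, II, III, IV, V, VI for a given number of jobs.
GAUSSIAN_TIMES = {
    1:  [516, 499, 504, 501, 498, 511],
    2:  [516, 497, 499, 506, 500, 508],
    4:  [517, 505, 502, 503, 504, 510],
    8:  [539, 530, 503, 503, 514, 542],
    10: [657, 660, 509, 508, 520, 745],
    16: [1053, 1076, 663, 659, 680, 1085],
}


def percent_difference(elapsed, baseline):
    """Positive values are slowdowns, negative values are speedups."""
    return 100.0 * (elapsed - baseline) / baseline


def difference_table(times):
    """Percentage difference of every case against case I (first column)."""
    table = {}
    for jobs, row in times.items():
        baseline = row[0]
        table[jobs] = [round(percent_difference(t, baseline)) for t in row]
    return table


if __name__ == "__main__":
    for jobs, row in difference_table(GAUSSIAN_TIMES).items():
        print(jobs, row)
```

For the 16-job row, this gives 0, 2, -37, -37, -35, and 3 for cases I through VI.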

Cases V and VI simulate production environments where the second partition (LPAR2) might be partially or totally busy. Case V corresponds to an environment where LPAR2 is partially busy and some of the resources might be available for LPAR1, which is fully subscribed. In this case, we see behavior consistent with the previous case. In case VI, however, we clearly see that there are no additional resources, because both partitions are fully used. Here, the performance is similar to what we saw in case I: slightly slower for 8, 10, and 16 jobs, with the 10-job run being abnormally slower in case VI.

Table 3. Percentage difference for cases II through VI compared to case I for the Gaussian 03 application

Number of jobs  Case I  Case II  Case III  Case IV  Case V  Case VI
1               0       -3       -2        -3       -3      -1
2               0       -4       -3        -2       -3      -2
4               0       -2       -3        -3       -3      -1
8               0       -2       -7        -7       -5      1
10              0       0        -23       -23      -21     13
16              0       2        -37       -37      -35     3

Table 4 presents information similar to that shown for Gaussian 03. It summarizes the average elapsed times for the AMBER 7 throughput runs. The trends are basically the same as in Table 2: as the number of jobs in a throughput run increases, the elapsed time increases. The patterns observed for each of the cases with the previous application are reflected here as well.
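
A throughput run of the kind reported in these tables starts several copies of the same job at once and averages their elapsed times. The sketch below shows one way to do that in Python. It is not the harness used in this study: the function name is hypothetical, and the placeholder command stands in for an actual Gaussian, AMBER, or BLAST invocation.

```python
import subprocess
import sys
import time


def run_throughput(command, n_jobs, poll_interval=0.01):
    """Start n_jobs (>= 1) copies of command together.

    Returns (average elapsed seconds, list of per-job elapsed seconds),
    where each job's time is measured from the common start.
    """
    start = time.perf_counter()
    procs = [subprocess.Popen(command) for _ in range(n_jobs)]
    elapsed = [None] * n_jobs
    # Poll until every job has finished, recording when each one ends.
    while any(t is None for t in elapsed):
        for i, proc in enumerate(procs):
            if elapsed[i] is None and proc.poll() is not None:
                elapsed[i] = time.perf_counter() - start
        time.sleep(poll_interval)
    failed = [p.returncode for p in procs if p.returncode != 0]
    if failed:
        raise RuntimeError(f"{len(failed)} job(s) exited with a non-zero status")
    return sum(elapsed) / n_jobs, elapsed


if __name__ == "__main__":
    # Placeholder workload: a short CPU-bound Python one-liner.
    placeholder_job = [sys.executable, "-c", "sum(range(10**6))"]
    for n in (1, 2, 4):
        average, _ = run_throughput(placeholder_job, n)
        print(f"{n} job(s): average elapsed {average:.3f} s")
```

Because completion is detected by polling, the measured times are accurate only to within the polling interval, which is negligible for jobs that run for several minutes.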

Table 4. Average elapsed times, in seconds, for the AMBER 7 application

Number of jobs  Case I  Case II  Case III  Case IV  Case V  Case VI
1               574     571      572       571      571     572
2               574     576      575       574      577     585
4               574     577      576       577      577     596
8               592     619      575       575      587     644
10              710     880      579       580      589     882
16              1197    1207     743       747      772     1265

The percentage differences observed for AMBER 7 are similar to those for Gaussian 03 (see Table 5). For case II, the throughput runs with 10 and 16 instances of this sequential input show slowdowns of 24% and 1%, respectively. However, when virtual processors are available, we see an improvement in performance for cases III, IV, and V. This is reflected in the negative numbers for the throughput runs with 10 and 16 jobs.

Table 5. Percentage difference for cases II through VI compared to case I for the AMBER 7 application

Number of jobs  Case I  Case II  Case III  Case IV  Case V  Case VI
1               0       -1       0         -1       -1      0
2               0       0        0         0        1       2
4               0       1        0         1        1       4
8               0       5        -3        -3       -1      9
10              0       24       -18       -18      -17     24
16              0       1        -38       -38      -35     6

The last application that we tested is BLAST, which differs from the two previous applications. BLAST reads a database through mmap and does a pattern search. The BLAST family of tools for finding similarities between pairs of sequences was developed at the National Center for Biotechnology Information (NCBI) as part of its development of software tools for analyzing genome data. The BLAST tools can search databases regardless of whether the query is a sequence of amino acids or nucleotides.
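
The access pattern mentioned above, a database file that is memory-mapped and then scanned for a pattern, can be illustrated with Python's standard mmap module. This is a minimal sketch and not BLAST code: the function names are hypothetical, the data is a toy sequence, and the sketch only contrasts the mmap and read access paths without measuring system time.

```python
import mmap
import os
import tempfile


def _count(buffer, pattern):
    """Count (possibly overlapping) occurrences of pattern using buffer.find."""
    count, pos = 0, buffer.find(pattern)
    while pos != -1:
        count += 1
        pos = buffer.find(pattern, pos + 1)
    return count


def count_with_mmap(path, pattern):
    """Scan the file through a read-only memory mapping."""
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; mapping an empty file raises ValueError.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return _count(mapped, pattern)


def count_with_read(path, pattern):
    """Scan the file after reading it into memory with an ordinary read."""
    with open(path, "rb") as handle:
        return _count(handle.read(), pattern)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"ACGTACGTTTACG")   # toy stand-in for a sequence database
    try:
        print(count_with_mmap(tmp.name, b"ACG"), count_with_read(tmp.name, b"ACG"))
    finally:
        os.remove(tmp.name)
```

Both functions return the same count; they differ only in how the operating system brings the file contents into the process.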

BLAST uses a heuristic algorithm to carry out local alignments. This type of alignment or search can be carried out with the different programs available in BLAST, which are listed in Table 6.

Table 6. BLAST programs (for more information, see reference [13])

Program   Query                  Database
blastp    amino acid             protein
blastn    nucleotide             nucleotide
blastx    nucleotide translated  protein
tblastn   amino acid             nucleotide translated
tblastx   nucleotide translated  nucleotide translated

BLAST can be considered a three-step algorithm [14]: in step 1, the program compiles a list of high-scoring strings; in step 2, the program searches for hits, generating a seed for each successful hit; and in step 3, it extends the seeds.

The version of BLAST we used is BLAST 2.2.6 [13]. NCBI BLAST has a wrapper called blastall, which calls each of the programs in Table 6. Throughout this work, we invoked blastn. We used gi|5706771|gb|AC007518.16|AC007518 Mus musculus chromosome 6 clone 345_D_4 map6 as our query. The database is the human genome DNA sequence from the Sanger Centre's Ensembl server [15]. The version used in this work contains 44,521 sequences and 3,200,338,544 letters.

Table 7. Average elapsed times, in seconds, for the BLAST application

Number of jobs  Case I  Case II  Case III  Case IV  Case V  Case VI
1               458     480      484       461      460     461
2               461     458      462       462      463     487
4               464     475      481       469      469     498
8               511     528      489       492      513     550
10              611     776      510       511      526     833
16              989     2003     670       693      697     2405
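
The three-step description of BLAST given above (compile a word list, search for hits that become seeds, extend the seeds) can be made concrete with a toy nucleotide example. The sketch below is a strong simplification and not the NCBI implementation: it uses exact word matches only, ungapped extension, and an arbitrary scoring scheme, and all function names and parameter defaults are hypothetical.

```python
from collections import defaultdict


def build_word_index(query, w):
    """Step 1: index every length-w word of the query by its start position."""
    index = defaultdict(list)
    for i in range(len(query) - w + 1):
        index[query[i:i + w]].append(i)
    return index


def find_seeds(index, subject, w):
    """Step 2: every exact word hit in the subject yields a seed (query_pos, subject_pos)."""
    seeds = []
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], ()):
            seeds.append((i, j))
    return seeds


def extend_seed(query, subject, seed, w, match=1, mismatch=-1, x_drop=2):
    """Step 3: ungapped extension in both directions.

    Extension in a direction stops once the running score falls x_drop below
    the best score seen. Returns (score, query_span, subject_span), with
    half-open spans.
    """
    i, j = seed
    best = cur = w * match
    best_right = k = 0
    while i + w + k < len(query) and j + w + k < len(subject):
        cur += match if query[i + w + k] == subject[j + w + k] else mismatch
        k += 1
        if cur > best:
            best, best_right = cur, k
        elif best - cur >= x_drop:
            break
    cur = best
    best_left = k = 0
    while i - k - 1 >= 0 and j - k - 1 >= 0:
        cur += match if query[i - k - 1] == subject[j - k - 1] else mismatch
        k += 1
        if cur > best:
            best, best_left = cur, k
        elif best - cur >= x_drop:
            break
    return (best,
            (i - best_left, i + w + best_right),
            (j - best_left, j + w + best_right))


if __name__ == "__main__":
    query, subject, w = "ACGTACGT", "TTACGTACGTTT", 4
    seeds = find_seeds(build_word_index(query, w), subject, w)
    print(max(extend_seed(query, subject, s, w) for s in seeds))
```

In this toy example, the whole query aligns with positions 2 through 9 of the subject, so the best extension covers all eight query letters.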

As before, Table 7 summarizes the elapsed time as a function of the cases selected in this study. The trends are similar to those shown previously. However, as Table 8 shows, BLAST has a larger performance difference than Gaussian 03 and AMBER 7.

Table 8. Percentage difference for cases II through VI compared to case I for the BLAST application

Number of jobs  Case I  Case II  Case III  Case IV  Case V  Case VI
1               0       5        6         1        0       1
2               0       -1       0         0        0       6
4               0       2        4         1        1       7
8               0       3        -4        -4       0       8
10              0       27       -17       -16      -14     36
16              0       103      -32       -30      -30     143

Table 8 shows that BLAST experiences larger differences when running 10 or 16 jobs in throughput mode. However, it shows similar benefits when running with an available pool of shared virtual processors. The larger difference seen in case II and case VI with 10 and 16 jobs is characteristic of BLAST when compared against the two previous applications. For Gaussian 03 and AMBER 7, the largest slowdown for 16 jobs in any of the cases tested was approximately 6%. For BLAST, as shown in Table 8, the corresponding value for case VI with 16 jobs is one order of magnitude larger. These unusually large deviations from the baseline when there are no virtual processors have been discussed previously [16]. Here, we mention the main reason: AIX tends to add system time when executing an mmap. The system time, of course, is reflected in the elapsed time. This explains the deviation being a factor of 2 or 3 larger than for the other two applications.

Summary

In this study, we have tried to provide information about the performance of a series of throughput benchmarks on a system that has been configured with Micro-Partitioning. To show the benefit of Micro-Partitioning, we ran our set of benchmarks with and without Micro-Partitioning.
In other words, we tried to simulate a cluster of multiprocessor workstations that cannot make use of Micro-Partitioning versus a shared-memory system with Micro-Partitioning. A shared-memory system with logical partitions can simulate a cluster of multiprocessor workstations with the added benefit of Micro-Partitioning. This series of benchmarks has clearly shown that, for three different applications in Life Sciences, the availability of a pool of virtual processors improves the time to solution. A partition that has exhausted its resources can take advantage of a pool of shared virtual processors, provided that they are not required by other partitions.

References

1. Advanced POWER Virtualization on IBM eServer p5 Servers: Introduction and Basic Configuration, SG24-7940
2. Browning, L., IBM eServer p5 AIX 5L Support for Micro-Partitioning and Simultaneous Multi-threading, July 2004, available at: http://www.ibm.com/servers/aix/whitepapers/aix_support.pdf
3. Tsao, H-F. and B. Olszewski, IBM eServer p5 570 Server Consolidation Using POWER5 Virtualization White Paper, July 2004, available at: http://callisto.bstoke.uk.ibm.com/unixsolutions/white/p5consol.pdf
4. Gaussian 03, Revision C.01, M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, J. A. Montgomery, Jr., T. Vreven, K. N. Kudin, J. C. Burant, J. M. Millam, S. S. Iyengar, J. Tomasi, V. Barone, B. Mennucci, M. Cossi, G. Scalmani, N. Rega, G. A. Petersson, H. Nakatsuji, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, M. Klene, X. Li, J. E. Knox, H. P. Hratchian, J. B. Cross, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, P. Y. Ayala, K. Morokuma, G. A. Voth, P. Salvador, J. J. Dannenberg, V. G. Zakrzewski, S. Dapprich, A. D. Daniels, M. C. Strain, O. Farkas, D. K. Malick, A. D. Rabuck, K. Raghavachari, J. B. Foresman, J. V. Ortiz, Q. Cui, A. G. Baboul, S. Clifford, J. Cioslowski, B. B. Stefanov, G. Liu, A. Liashenko, P. Piskorz, I. Komaromi, R. L. Martin, D. J. Fox, T. Keith, M. A. Al-Laham, C. Y. Peng, A. Nanayakkara, M. Challacombe, P. M. W. Gill, B. Johnson, W. Chen, M. W. Wong, C. Gonzalez, and J. A. Pople, Gaussian, Inc., Wallingford CT, 2004
5. Frisch, A. E. and M. J. Frisch, Gaussian 03 User's Reference, 2nd Edition, Gaussian, Inc., available at: http://www.gaussian.com
6. Hehre et al., Ab Initio Molecular Orbital Theory, Wiley-Interscience, 1986, ISBN 0471812412
7. Sosa, C. P., et al., Ab-initio quantum chemistry on a cc-NUMA architecture using OpenMP, Parallel Computing, 26, pages 843-856, 2000

8. Sosa, C. P. and S. Andersson, Gaussian benchmarks put the pSeries 690 server through its paces, February 2002, available at: http://www.ibm.com/servers/esdd/articles/gauss_bench/index.html
9. Ab Initio Quantum Chemistry on the IBM pSeries 690: A Comparison Between Turbo 1.3 GHz and Turbo 1.1 GHz, REDP-0444
10. Sosa, C. P., B. V. Atyam, and J. Hilsabeck, in preparation
11. Case et al., AMBER 7 User's Manual, University of California
12. For more information about AMBER on IBM systems, visit: http://www.msi.umn.edu/~cpsosa/chemapps/molmech/amber/amber.html
13. Altschul, S. F., et al., Basic Local Alignment Search Tool, Journal of Molecular Biology, Volume 215, Issue 3, pages 403-410, October 5, 1990
14. Setubal, C., and J. Meidanis, Introduction to Computational Molecular Biology, PWS Publishing, 1997, ISBN 0534952623
15. Sanger Institute Human Genome Server, available at: http://www.ensembl.org/homo_sapiens/
16. BLAST Throughput Benchmarks: mmap versus read, REDP-3692

The team that wrote this Redpaper

This Redpaper was produced by a team of specialists.

Carlos P. Sosa (IBM and University of Minnesota Supercomputing Institute, IBM eServer Solutions Enablement) is a Senior Technical Staff Member in the Systems Group, where he has been a member of the IBM Chemistry and Life Sciences high-performance effort since 2001. For the last 18 years, he has focused on scientific applications with an emphasis on Life Sciences, parallel programming, benchmarking, and performance tuning. Carlos received a Ph.D. degree in Physical Chemistry from Wayne State University. He completed his post-doctoral work at the Pacific Northwest National Laboratory. His research interests are in the area of new pSeries architectures, Blue Gene, and molecular cell biology. He is currently working with researchers at the University of Minnesota trying to classify the vertebrate secretome.

Balaji V. Atyam has been a Senior Software Engineer in the Systems and Technology Group since 2000.
His responsibilities include porting, benchmarking, performance tuning, parallel programming, and technical consulting services for key independent software vendors (ISVs) in the area of High Performance Computing on IBM Eserver. He received his Ph.D. in Applied Mathematics from the Indian Institute of Technology, Roorkee, India. He was a Scientist/Engineer at the Indian Space Research Organization (ISRO) prior to joining IBM.

Peter Heyrman is a Senior Technical Staff Member in Rochester, MN. He has worked at IBM for 24 years. He previously worked on iSeries TPC-C performance and currently works on the IBM Eserver Hypervisor.

Naresh Nayar is a Senior Technical Staff Member with the Systems and Technology Group at IBM in Rochester, MN. He joined IBM in 1992 and has worked on many i5/OS kernel projects with a focus on synchronization primitives and task dispatching. His most recent work has been in the area of LPAR for iSeries and pSeries systems, and he holds numerous patents in the area of partitioning and kernel design. He holds a Bachelor of Technology degree in Electrical Engineering from the Indian Institute of Technology, New Delhi, India, and M.S. and Ph.D. degrees in Computer Science from Iowa State University.

Jeri Hilsabeck is the Manager of Integrated and Sector Solutions Enablement. She has 19 years of experience in the computer industry. She holds a B.A. in Computer Science from The University of Texas at Austin. Prior to nine years of management, her area of expertise was the development of the AIX operating system.

Thanks to the following people for their contributions to this project:

CPS would like to give special thanks to Sam Ellis from IBM Rochester for facilitating our interaction with the Hypervisor team at IBM Rochester (P. Heyrman and N. Nayar). We would also like to thank Scott Vetter for his help and for the use of Figure 1 on page 3, which is part of the IBM Redbook Advanced POWER Virtualization on IBM Eserver p5 Servers: Introduction and Basic Configuration, SG24-7940. We thank Dr. Joel Tendler for valuable discussions and suggestions about how to present some of our results. CPS would like to thank Bruce Hurley for encouraging and supporting this effort.

Notices

This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

Copyright IBM Corp. 2005. All rights reserved.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.

Send us your comments in one of the following ways:
Use the online Contact us review redbook form found at: ibm.com/redbooks
Send your comments in an email to: redbook@us.ibm.com
Mail your comments to: IBM Corporation, International Technical Support Organization, Dept. JN9B Building 905, 11501 Burnet Road, Austin, Texas 78758-3493 U.S.A.

Trademarks

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
AIX 5L, AIX, Eserver, Hypervisor, ibm.com, IBM, iSeries, Micro-Partitioning, POWER4, POWER5, PowerPC, POWER, pSeries, Redbooks (logo), TotalStorage

Other company, product, and service names may be trademarks or service marks of others.