AASPI SOFTWARE PARALLELIZATION



Introduction

Generation of multitrace and multispectral seismic attributes can be computationally intensive. For example, each input seismic trace may generate 50 or more complex spectral components, and computing long-wavelength curvature at a given voxel may involve a window of 21 traces by 21 traces by 81 vertical samples. Our assumption is that our sponsors have access to commercial software that provides fast, but perhaps less accurate, algorithms. For example, one of the larger but less expensive commercial packages computes coherence along time slices rather than along structural dip. Another, higher-end and more expensive system allows such computation along structure, but not as the default.

In our implementation, we assume the user has access to multiple processors, whether on his or her desktop, on a cluster down the hall, or on a remote supercomputer. For this reason, almost all of our algorithms have been parallelized. Exceptions include programs with a high input/output-to-computation ratio, such as data slicing and plotting. The AASPI software is built on the Message Passing Interface (OpenMPI). The executables cost nothing to run beyond installation of the proper run-time libraries, which we include in the distribution. If you wish to recompile, OpenMPI is free on most Linux clusters. OpenMPI support on Windows has been discontinued; in this case we advise purchasing the Intel f90 compiler, which includes INTELMPI.

Parallelization Flow Chart

We use three different parallelization structures: one for multitrace attributes computed using a computational stencil, one for single-trace attributes such as spectral decomposition, and one for prestack migration/demigration.

Parallelization of multitrace attributes

All of the volumetric attributes (e.g., dip3d, filter_single_attribute, similarity3d, sof3d, curvature3d, glcm3d, disorder, and all of the poststack image processing algorithms) are computed on a rectilinear grid, requiring padding of the data when converting from SEGY to AASPI format. For these algorithms, each live seismic sample requires the same amount of work. In the figure below, the seismic data are sent by the master to n slaves. Upon completion, the slaves send the data back to the master, which writes the results out to disk. The processors are synchronized after the completion of each seismic line.

Parallelization of single trace and single gather attributes

You will note that if there are large no-permit zones in your padded data volume, some of the processors will remain idle. For this reason, we use a different strategy when computing single-trace attributes, such as the spectral decomposition programs spec_cmp, spec_cwt, spec_clssa, and spectral_probe. Note that the matching pursuit programs spec_cmp and mp_nmo are trace- or gather-independent nonlinear processes whose computation time depends on the seismic waveforms. For trace- and gather-parallelized algorithms, the parallelization is synchronized after a fixed number of traces per block (default = 10,000 traces).

Parallelization of migration and demigration algorithms

Currently (Fall 2015) we parallelize over output lines and input blocks of seismic traces. Unfortunately, this process becomes inefficient when the seismic lines are so long that a single line of CRP gathers no longer fits in memory. For this reason, Yuji Kim is reworking the parallelization strategy to be limited by output blocks of CRPs as well as output lines. The flow chart will follow shortly.

The Parallelization Tab common to AASPI GUIs

Parallel computation on Linux and Windows standalone machines and clusters

All AASPI GUIs use the same parallelization tab. As an example, we display the one shown in the aaspi_dip3d GUI. The (1) number of processors per node is read from the aaspi_default_parameters file, which should be set up with reasonable values for your system (see that part of the documentation). If you do not know the number of processors on your local host, simply (2) click Determine Maximum Processors on localhost. I am running on a machine called tripolite. (The default local host name is localhost, which is also a valid entry on line 3.) When I click this button I find that there are 24 processors.
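If you prefer a terminal, the value reported by Determine Maximum Processors on localhost can be cross-checked on Linux with standard system utilities (these are generic Linux commands, not part of AASPI):

    # Number of logical processors on the local host
    nproc

    # Equivalent check: count the processor entries the kernel reports
    grep -c '^processor' /proc/cpuinfo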

Since I am running on a shared system, I don't wish to use all 24 processors, or my colleagues will be angry with me. Instead, I (3) submit the job over three computers (or nodes). Specifically, I will use 12 processors on tripolite, 8 processors on jade, and 8 processors on hematite. The syntax reads

    tripolite:12 jade:8 hematite:8

where each node name is followed by a colon and the number of processors on that node, and the node names are separated by spaces. When the job is submitted, it will run over 12+8+8=28 processors and (if the network is sufficiently fast) complete approximately 28 times faster than it would on a single processor.

Such parallelization is easy to set up on Linux. For Windows, it will require a local area network and a reasonably fast communication backbone. In both cases, your Information Technology staff will need to set up passwordless access for your user login ID to the target machine IP addresses or machine names (a fairly simple process). The default on both platforms is to require a password each time someone accesses machine X from machine Y, in order to prevent unwarranted access.
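For reference, the node:processor syntax shown above (tripolite:12 jade:8 hematite:8) corresponds to an ordinary OpenMPI host list. Outside the GUI, the same resource description could be written as a hostfile and handed to mpirun; the hostfile name, program name, and parameter-file name below are placeholders, and the AASPI GUI normally builds and submits the actual command for you:

    # my_hosts.txt: hypothetical OpenMPI hostfile equivalent to
    # "tripolite:12 jade:8 hematite:8"
    tripolite slots=12
    jade      slots=8
    hematite  slots=8

    # Launch 28 MPI processes across the three nodes (illustrative only;
    # program and parameter-file names are placeholders)
    mpirun --hostfile my_hosts.txt -np 28 some_aaspi_program some_parameter_file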
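On Linux, the passwordless access just described is normally arranged with SSH keys. A typical recipe (your IT staff may have site-specific policies that supersede this) is:

    # Generate an SSH key pair once on the submitting machine (accept the defaults)
    ssh-keygen -t rsa

    # Copy the public key to each target node, e.g. jade and hematite
    ssh-copy-id jade
    ssh-copy-id hematite

    # Verify that logging in no longer prompts for a password
    ssh jade hostname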

Parallel computation on Linux Supercomputers running LSF

Several of our larger sponsors have access to supercomputers purchased for reservoir simulation and prestack migration. Seismic attributes computed on very large 3D marine surveys can also take considerable time. As of November 2015, sponsors have run our software using upwards of 1,500 processors. Supercomputers are generally used not for interactive jobs but for large batch jobs. There are currently two popular batch job submission systems, LSF and PBS.

The Load Sharing Facility (LSF) is the batch system we use here on the OU supercomputer, called OSCER. In our case, AASPI has purchased a subset of dedicated condominium nodes that forms part of the supercomputer system. When we submit a job, we go to the head of the line and other users page out. However, when we are not using the compute cycles, others on the system are free to use them (cycles cannot be warehoused!). Since the computer is about one mile from our building and is housed in the weather center, most of these compute cycles are used in severe weather modeling.

I click the tab Build an LSF Script and obtain the following GUI. Note that (4) the tab now reads Build an LSF Script. In my AASPI_default_parameters file I had previously set (10) the maximum number of hours I will run, (6) the name of the batch queue, and (8) the maximum number of processors on that queue. Yes, the queue is named kurt_q, so that my students can say "kurt has crashed," "kurt can't get up," or "kurt is down again" and still sound professional. While this queue has only 144 processors, others have 1000.

Let's assume I request all 1000 processors. Examining the parallelization strategy for geometric attributes shown in the first flow chart above, you may notice that if I have only 550 cdps per line in a 3D survey, 450 of the processors would be reserved away from other users yet remain idle. For this reason, the GUI provides (8) the Determine Optimum Number of Batch Processors button, which shows (for the 144-processor example and the Boonsville survey, where each seismic line has 133 cdps) that the largest number of processors that can be used is 133. If I wish to allow others to use the system, the next optimum value is 67 processors, running twice as long as if I had used 133 processors but freeing up 144-67=77 processors for other people to use. Choosing this value, I obtain the following image: note that (9) the number 67 now appears after Number of LSF processors to use.
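The arithmetic behind these optimum values can be reproduced by hand: the run time is governed by the processor assigned the most cdps, so the useful processor counts appear to be 133 divided by a whole number and rounded up. The sketch below illustrates that reasoning for the Boonsville example; it is an illustration of the arithmetic, not the GUI's actual code:

    # Boonsville example: 133 cdps per line
    ncdp=133
    # Candidate processor counts: ceiling(133/k) for k = 1, 2, 3, 4
    for nproc in 133 67 45 34; do
        # passes = number of cdps the busiest processor must handle per line
        passes=$(( (ncdp + nproc - 1) / nproc ))
        echo "nproc=$nproc -> up to $passes cdps per processor per line"
    done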

When I click Execute dip3d on the main tab, the GUI generates a file called dip3d_boonsville.bsub and then submits it. You can look at this file by typing cat dip3d_boonsville.bsub. This is the required batch job deck on our supercomputer system. We have not yet figured out how to string together multiple batch jobs to generate a workflow.
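The exact contents of dip3d_boonsville.bsub depend on your queue and AASPI_default_parameters settings and are not reproduced here. As a rough, hypothetical sketch (the directive values and the mpirun line are illustrative, not necessarily what the GUI writes), an LSF job deck of this kind has the general form:

    #!/bin/bash
    # Hypothetical LSF directives: job name, queue, processor count,
    # wall-clock limit (hours:minutes), and output/error files (%J = job ID)
    #BSUB -J dip3d_boonsville
    #BSUB -q kurt_q
    #BSUB -n 67
    #BSUB -W 12:00
    #BSUB -o dip3d_boonsville.%J.out
    #BSUB -e dip3d_boonsville.%J.err

    # Launch the parallel executable on the processors LSF allocates
    # (program and parameter-file names are placeholders)
    mpirun -np 67 some_aaspi_program some_parameter_file

    # A deck like this is submitted with:  bsub < dip3d_boonsville.bsub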

Parallel computation on Linux Supercomputers running PBS

The Portable Batch System (PBS) is used by at least three of our sponsors on their supercomputer systems. Unlike LSF, we do not have this software here at OU. Also unlike LSF, there will be one or two system-specific components of the submission deck that need to be tweaked by your IT folks. This is a one-time modification of a file used in building the job deck. The computational values used in this option are identical to those used with LSF. The output job deck is now called dip3d_boonsville.qsub.
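For comparison, a PBS job deck of the same kind might look like the hypothetical sketch below; the queue name and resource line are exactly the sort of site-specific entries your IT folks would adjust:

    #!/bin/bash
    # Hypothetical PBS directives: job name, queue, nodes and processors per node,
    # wall-clock limit, and output/error files (values are illustrative)
    #PBS -N dip3d_boonsville
    #PBS -q batch
    #PBS -l nodes=6:ppn=12
    #PBS -l walltime=12:00:00
    #PBS -o dip3d_boonsville.out
    #PBS -e dip3d_boonsville.err

    # PBS starts the job in the home directory; change to the submission directory
    cd $PBS_O_WORKDIR

    # 6 nodes x 12 processors per node = 72 MPI processes (placeholders again)
    mpirun -np 72 some_aaspi_program some_parameter_file

    # A deck like this is submitted with:  qsub dip3d_boonsville.qsub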