



WILLIAM B. HURST, PH.D.
321 Madison Pl
Benton, AR 72015
T 501-993-1459
Bwbhurst411@gmail.com

Statement of Research Interests

Current Research Projects

My most recent research focused on High Performance Computing (HPC) systems. The problem I endeavored to solve involved the difficulties facing potential and existing HPC owners. After encountering the problems of evaluating the seemingly endless combinations of hardware components and software configurations that are aggregated into an HPC system, it became clear to me that these evaluations pose a significant purchasing obstacle for any prospective buyer. While the goal of every potential buyer is obviously to get the best system possible while minimizing expenses, determining which system is optimal for a given buyer is a source of considerable confusion. Add to this confusion mounting concerns about whether the system can expand to serve a growing customer base, and trepidation begins to flood the buyer. The question is: how does a potential buyer make the most intelligent decision on a future HPC purchase?

My solution was to develop a freely accessible HPC simulator that enables a potential buyer to test the impact of different hardware components on the performance of an HPC system. The idea must have been a good one, because today many other researchers are working towards the creation of the perfect HPC simulator; all of us are in effect racing to a better solution. While we are all still improving our different simulator versions, we have not yet reached a final solution, and a great many improvements can still be made.

Although there is more work to be done, my last set of HPC system vs. HPC system simulator test runs was extremely successful.
Utilizing job performance data from an existing live HPC system, the simulator was able to predict the performance of a non-existent HPC system with such accuracy that its results were statistically equivalent to those of the live HPC system once it did exist. Now that this has been achieved, it is possible to test the impact of different hardware component combinations on the performance results.

The accuracy of the HPC system simulator's performance predictions for a non-existent 256 CPU system can be seen in the figures below, but it is perhaps best quantified through Table I. Each of the diagrams below shows a single instance of the data results from the live system experiment and three instances of the simulated system results. The variations among the simulated systems are related to job scheduling variations within the simulations. One simulation utilized the First-In-First-Out (FIFO) protocol; a second simulation set prioritized jobs based on the number of CPUs requested by each job (Priority FIFO); and a third simulation data set utilized a mathematical model identified in Kendall notation as M/M/s Priority FIFO. This model can be described as jobs arriving according to a Markovian (Poisson) process, with each job requiring a service time drawn from a Markovian (exponential) distribution, on an HPC system having s servers.

Fig. 1. Results: 256 CPU HPC System Comparisons. (a) Number of Jobs (b) Inter-Arrival Time (c) Queuing Time (d) Processing Time (e) Service Time (f) Arrival Rate
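The M/M/s model described above can be illustrated with a minimal sketch. It is not the simulator discussed in this statement: the function names `simulate_mms` and `utilization`, their parameters, and the simplification that every job occupies exactly one server (real HPC jobs request varying numbers of CPUs, and the priority rule is omitted) are all assumptions made for illustration.

```python
import heapq
import random


def utilization(arrival_rate, service_rate, servers):
    """Offered utilization of an M/M/s queue: lambda / (s * mu)."""
    return arrival_rate / (servers * service_rate)


def simulate_mms(arrival_rate, service_rate, servers, n_jobs, seed=0):
    """Simulate a plain FIFO M/M/s queue and return each job's queuing time.

    Assumption: every job occupies exactly one server.
    """
    rng = random.Random(seed)
    # Min-heap holding the time at which each server next becomes free.
    free_at = [0.0] * servers
    heapq.heapify(free_at)
    clock = 0.0
    waits = []
    for _ in range(n_jobs):
        # Exponential inter-arrival times give Poisson (Markovian) arrivals.
        clock += rng.expovariate(arrival_rate)
        # FIFO: the next job in arrival order takes the server that frees up first.
        earliest = heapq.heappop(free_at)
        start = max(clock, earliest)
        waits.append(start - clock)
        # Exponential (Markovian) service time.
        heapq.heappush(free_at, start + rng.expovariate(service_rate))
    return waits


if __name__ == "__main__":
    waits = simulate_mms(arrival_rate=8.0, service_rate=1.0, servers=16, n_jobs=10_000)
    print("utilization:", utilization(8.0, 1.0, 16))
    print("mean queuing time:", sum(waits) / len(waits))
```

Processing jobs strictly in arrival order and always assigning the earliest-free server keeps start times non-decreasing, which is what makes the discipline FIFO. A Priority FIFO variant would additionally need a waiting list ordered by priority, with arrival order as the tie-breaker.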

Fig. 2. Results: 256 CPU HPC System Comparisons (Continued...). (a) Intensity (b) Utilization

TABLE I
SIMULATOR PREDICTIVE ACCURACY

Job Class   HPC Live    MMs Sim     Diff   Pcnt Error
    1         126.55     135.30     8.91      6.91
    2         252.40     256.81     4.41      1.75
    3         456.59     457.56     0.97      0.21
    4         597.13     598.17     1.04      0.17
    5         960.60     963.85     3.25      0.34
    6        1205.05    1205.82     0.77      0.06
    7        1603.72    1608.21     4.49      0.28
    8        1822.52    1829.57     7.05      0.39
    9        1935.04    1959.85    24.81      1.28
   10        2116.33    2120.31     3.98      0.19
   11        2139.15    2137.77     1.38      0.06
   12        2232.66    2236.57     3.91      0.18
   13        2383.29    2382.34     0.95      0.04
   14        2434.91    2442.27     7.36      0.30
   15        2501.33    2504.16     2.83      0.11
   16        2531.24    2627.67    96.43      3.81
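The Diff and Pcnt Error columns of Table I can be read as an absolute difference and a percentage relative to the live value. That reading is an assumption, since the statement does not define the columns, but it reproduces the three rows sampled in the sketch below. The function name `percent_error` is hypothetical.

```python
def percent_error(live, simulated):
    """Return (absolute difference, percent error relative to the live value).

    Assumed definitions: Diff = |Sim - Live| and Pcnt Error = 100 * Diff / Live.
    """
    diff = abs(simulated - live)
    return diff, 100.0 * diff / live


# Three rows taken from Table I: (job class, HPC live, MMs sim).
rows = [
    (2, 252.40, 256.81),
    (9, 1935.04, 1959.85),
    (16, 2531.24, 2627.67),
]

for job_class, live, sim in rows:
    diff, pct = percent_error(live, sim)
    print(f"{job_class:>2}  diff={diff:6.2f}  pcnt_error={pct:5.2f}")
```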

Among the improvements that still need to be made:

- Processing overhead added to simulations
- Resource management polling period
- Job backfilling
- Additional constraints of disk, swap and memory added to the job scheduler parameters list

The relevance of the work is that, now that the HPC simulator has been created, you, like others, can use the software to re-evaluate your existing systems, to evaluate future needs, to predict the performance of an existing system that has been expanded using similar hardware components, or to test the impact of non-existent hardware designs; for example, unique combinations of multi-core CPUs and GPGPUs that have not yet been designed.

Research Project Plans

Looking beyond the list of needed improvements to the simulator mentioned above, it is worth noting that this simulator was created in full accordance with the following definition of simulation:

Simulation: the technique of representing the real world by a computer program; a simulation should imitate the internal processes and not merely the results [1]

Embedded within the simulator are concepts of operating system models, ready queues, CPU processing, I/O blocking, networking between machines, switch/router processing, client classes, an HPC head node and job scheduling. It is this mirroring of live system algorithms within the simulator that makes it possible to apply hardware parameter specifications to the appropriate hardware component simulation. The impact of these variations on system performance can then be tested.

Prior Research

Software Bug Predictions through Component Life-Cycle Analysis

For my Master's degree in Computer Science, I worked on developing bug predictions using component life-cycle analysis. The predictions involved identifying the life-cycles of each of the 31 primary sub-component development threads contained within a single software product. I then showed that once the life-cycles of the product sub-components were identified, predictions about future software bug reports became significantly more accurate. The primary software product used during this research process was libsvm [2]. While life-cycle analysis had been performed before on software bug predictions, it had not been applied at a sub-component level using the libsvm software package. The contributions of this project were two-fold: one benefit was the demonstration of using a new software tool to perform bug prediction; the second was the discovery of the life-cycles for the sub-components within the software product.

Text Search Analysis on Web Page Repositories

For approximately one month, I researched enterprise-level text search methods as they are applied to web page repositories. The experience illuminated for me the difficulties found in returning relevant information to web queries; difficulties that companies like Google and Yahoo face millions of times every day. This type of activity is of concern to corporations and even countries because of the extremely large quantities of new data generated every year through web queries, text messages, emails, graphics files, audio files and data transfers. One report from 2002 estimated this volume at 5 exabytes. As a result, I concluded that new methods must be found to analyze, qualify, quantify and store these new volumes of data. It seemed to me that only an HPC system would be capable of the computational cycles required to analyze the overwhelming size of the raw data generated. This conclusion only reinforced my previous interest in HPC systems.

Sincerely,
William B. Hurst, Ph.D.

REFERENCES

[1] wordnetweb.princeton.edu. Definitions of simulation on the web. http://wordnetweb.princeton.edu/perl/webwn (2011).
[2] Chang, C.-C. & Lin, C.-J. LIBSVM: a Library for Support Vector Machines (2001).