Profiling and Tracing in Linux
Sameer Shende
Department of Computer and Information Science
University of Oregon, Eugene, OR, USA

Abstract

Profiling and tracing tools can help make application parallelization more effective and identify performance bottlenecks. Profiling presents summary statistics of performance metrics, while tracing highlights the temporal aspect of performance variations, showing when and where in the code performance is achieved. A complex challenge is the mapping of performance data gathered during execution to high-level parallel language constructs in the application source code. Presenting performance data in a meaningful way to the user is equally important. This paper presents a brief overview of profiling and tracing tools in the context of Linux, the operating system most commonly used to build clusters of workstations for high performance computing.

1. Introduction

Understanding the behavior of parallel programs is a daunting task. Two activities that aid the user in understanding the performance characteristics of a program are profiling and tracing. To understand the behavior of a parallel program, we must first make its behavior observable. To do this, we regard the execution of a program as a sequence of actions, each representing some significant activity such as the entry into a routine, the execution of a line of code, a message communication or a barrier synchronization. These actions become observable when they are recorded as events. Thus, an event is an encoded instance of an action [10]. It is the fundamental unit we use in understanding the behavior of a program. More specifically, the execution of an event interrupts the program and initiates one of the following responses: the computation is stopped, visible attributes of the event are recorded, or statistics are collected. After an event response is completed, control returns to the program. The response to an event depends on its use: debugging, tracing or profiling.

Profiling updates summary statistics of execution when an event occurs. It uses the occurrence of an event to keep track of statistics of performance metrics. These statistics are maintained at runtime as the process executes and are stored in profile data files when the program terminates. Profiles are subsequently analyzed to present the user with aggregate summaries of the metrics. Tracing, on the other hand, records a detailed log of timestamped events and their attributes. It reveals the temporal aspect of the execution, allowing the user to see when and where routine transitions, communication and user-defined events take place. Thus, it helps in understanding the behavior of the parallel program.

Performance analysis based on profiling and tracing involves three phases: instrumentation, the modification of the program to generate performance data; measurement of interesting aspects of execution, which produces the performance data; and analysis of the performance data.
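To make the distinction concrete, the following sketch (in C, not taken from any particular tool) shows the two kinds of responses a measurement system might attach to a routine's entry and exit events: a tracing response that writes timestamped records, and a profiling response that only updates summary statistics. All names are illustrative.

    /* Minimal sketch (illustrative names): the two event responses that a
     * measurement system might attach to routine entry/exit.  Profiling
     * updates summary statistics; tracing appends timestamped records. */
    #include <stdio.h>
    #include <time.h>

    static unsigned long call_count;     /* profile: number of calls       */
    static double        total_seconds;  /* profile: accumulated time      */

    static double now(void)              /* seconds from a monotonic clock */
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    static void work(FILE *trace)        /* an instrumented routine        */
    {
        double t_entry = now();
        fprintf(trace, "ENTER work %f\n", t_entry);  /* tracing response   */

        for (volatile int i = 0; i < 1000000; i++)   /* the routine's body */
            ;

        double t_exit = now();
        fprintf(trace, "EXIT  work %f\n", t_exit);   /* tracing response   */
        call_count++;                                /* profiling response */
        total_seconds += t_exit - t_entry;           /* profiling response */
    }

    int main(void)
    {
        FILE *trace = fopen("trace.log", "w");
        for (int i = 0; i < 10; i++)
            work(trace);
        fclose(trace);
        printf("work: %lu calls, %f s total\n", call_count, total_seconds);
        return 0;
    }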
2. Instrumentation

Common to both profiling and tracing is the instrumentation phase, in which instructions are added to the program to generate performance data. Instrumentation helps in identifying the section of code that is executing when an event is triggered. It is also responsible, during execution, for measuring the resources consumed and mapping these resources to program entities, such as routines and statements. Relating the code segments back to source-code constructs and entities that the user can comprehend is important.

As shown in Figure 1, instrumentation can be added to the program at any stage of the compilation/execution process, from user-added calls in the source code to a completely automated approach. (Events are triggered by the execution of the instrumentation instructions at runtime.) Moving down the stages from source code instrumentation to runtime instrumentation, the instrumentation mechanism changes from language specific to platform specific. Making an instrumentation API specific to a language for source code instrumentation may make the instrumentation portable across multiple compilers and platforms that support the language; however, the instrumentation may not work with other languages. Targeting the instrumentation at the executable level ensures that applications written in any language that generates an executable benefit from the instrumentation on the specific platform. However, it is then more difficult to map performance data back to source-code entities, especially if the application's high-level programming language (such as HPF or ZPL) is translated to an intermediate language (such as Fortran or C) during compilation. There is often a trade-off between the level of abstraction the tool can provide and how easily the instrumentation can be added.

The JEWEL [19] package requires that calls be added manually to the source code. While providing flexibility, the process can be cumbersome and time consuming. A preprocessor can generate instrumented source code by automatically inserting annotations into the original source. This approach is taken by SvPablo [14] and AIMS [20] for C and Fortran, and by TAU [15] with PDT for C++ [18]. Instrumentation can be added by a compiler, as in gprof [5] (with the -pg command-line switch for the GNU compilers). Instrumentation can also be added during linking, when the executable image is created by linking multiple object files and libraries together. Packages that use this method include VampirTrace [13] and PICL [12] for the MPI and PVM message passing libraries. When an inter-process communication call is executed, the instrumented library acts as a wrapper that calls the profiling and tracing routines before and after calling the corresponding uninstrumented library call. This is achieved by weak library bindings.
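As a concrete illustration of such a wrapper, the sketch below uses the MPI profiling interface, which exposes each routine under a PMPI_ prefix so that a tool-supplied MPI_Send can time the call and then delegate to the underlying implementation. The printed output is only a placeholder for whatever a real tool would record.

    /* Sketch of a wrapper library routine for MPI_Send.  The MPI profiling
     * interface provides every routine under a PMPI_ name, so a wrapper
     * linked ahead of the MPI library can intercept the call, measure it,
     * and delegate to the real implementation. */
    #include <mpi.h>
    #include <stdio.h>

    int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm)
    {
        double start = MPI_Wtime();                 /* before the call */
        int rc = PMPI_Send(buf, count, datatype, dest, tag, comm);
        double elapsed = MPI_Wtime() - start;       /* after the call  */

        /* A real tool would update a profile table or write a trace
         * record here; printing is just a placeholder. */
        fprintf(stderr, "MPI_Send to rank %d: %g s\n", dest, elapsed);
        return rc;
    }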
Often, however, program sources and libraries are unavailable to the users and it is necessary to examine the performance of executable images. Pixie [16] and Atom [17] are tools that rewrite binary executable files while adding the instrumentation, but they are not currently available under Linux.

All these approaches require the programmer to modify the application, then re-compile and re-execute the instrumented application to produce the performance data. Sometimes, re-execution of an application under the performance analysis tool is not a viable alternative while searching for performance bottlenecks. This is true for long-running tasks such as database servers. Paradyn [9] automates the search for bottlenecks in an application at runtime by selectively inserting and deleting instrumentation using the DynInst [7] dynamic instrumentation package.

After instrumentation, we execute the program, perform measurements in the form of profiling and tracing, and analyze the data.

Figure 1: Instrumentation options range from language specific to platform specific mechanisms. (Stages: source code, preprocessor, compiler, object code, linker, executable, execution; example tools: JEWEL, TAU/PDT, AIMS, SvPablo, gprof, VampirTrace, PICL, Atom, Pixie, Paradyn.)

3. Profiling

Profiling shows summary statistics of performance metrics that characterize the performance of an application. Examples of metrics include the CPU time associated with a routine, the count of secondary data cache misses associated with a group of statements, and the number of times a routine executes. These metrics are typically presented as sorted lists that show the contribution of each routine. The two main approaches to profiling are sampled process timing and measured process timing.

In profiling based on sampling, a hardware interval timer periodically interrupts the execution. Instead of time, these interrupts can also be triggered by CPU performance counters, which measure events in hardware. Commonly employed tools such as prof [4] and gprof [5] use time-based sampling. When an interrupt occurs, the state of the program counter (PC) is sampled, and a histogram that represents the frequency distribution of PC samples is maintained. The program's image file and symbol table are used post-mortem to calculate the time spent in each routine as the number of samples in the routine's code range times the sampling period. Instead of the PC, the callstack of routines can be sampled as well. In this case, a table that maintains the time spent exclusively and inclusively (of called child routines) in each routine is updated when an interrupt occurs: the exclusive time of the currently executing routine and the inclusive time of the other routines on the callstack are incremented by the inter-interrupt time interval. Additionally, in gprof, each time a parent calls a child function, a counter for that parent-child pair is incremented. Gprof then shows the sorted list of functions and their call-graph descendents. Below each function entry are its call-graph descendents, showing how their times are propagated to it.
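The following simplified sketch of time-based sampling assumes Linux on x86-64 with glibc: a SIGPROF handler delivered by an interval timer samples the interrupted program counter into a histogram. A real profiler would size the histogram from the program's text segment and use the symbol table to attribute buckets to routines; the bucket granularity here is arbitrary.

    /* Sketch of time-based sampling on Linux/x86-64 (glibc): a SIGPROF
     * handler samples the interrupted program counter into a histogram.
     * Symbol-table lookup to map buckets back to routines is omitted. */
    #define _GNU_SOURCE            /* for REG_RIP in <sys/ucontext.h> */
    #include <signal.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/time.h>
    #include <ucontext.h>

    #define NBUCKETS 4096
    static unsigned long histogram[NBUCKETS];

    static void on_sigprof(int sig, siginfo_t *si, void *uc)
    {
        (void)sig; (void)si;
        /* REG_RIP is the x86-64 program counter in the saved context. */
        uintptr_t pc = (uintptr_t)((ucontext_t *)uc)->uc_mcontext.gregs[REG_RIP];
        histogram[(pc >> 4) % NBUCKETS]++;   /* 16-byte buckets, wrapped */
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = on_sigprof;
        sa.sa_flags = SA_SIGINFO | SA_RESTART;
        sigaction(SIGPROF, &sa, NULL);

        /* 10 ms sampling period in process CPU time, delivered as SIGPROF. */
        struct itimerval it = { {0, 10000}, {0, 10000} };
        setitimer(ITIMER_PROF, &it, NULL);

        volatile double x = 0.0;             /* work being profiled */
        for (long i = 0; i < 200000000L; i++)
            x += i * 0.5;

        unsigned long total = 0;
        for (int i = 0; i < NBUCKETS; i++)
            total += histogram[i];
        printf("collected %lu PC samples\n", total);
        return 0;
    }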
In profiling based on measured process timing, the instrumentation is triggered at routine entry and exit. At these points, a precise timestamp is recorded and performance metrics comprising timers and counters are updated. TAU [15] uses this approach. TAU's modular profiling and tracing toolkit features support for hardware performance counters, selective profiling of groups of functions and statements, user-defined events and threads. It works with C++, C and Fortran on a variety of platforms, including Linux. Its modules can be assembled by the user during the configuration phase, and a profiling or tracing library can be tailor-made to the user's specification. Profiles can be analyzed using racy, a GUI, as shown in Figure 2, or pprof, a prof-like utility that can sort and display tables of metrics in text form.

Figure 2: TAU's profile browser tool racy.

3.1 Hardware Performance Counters

A profiling system needs access to an accurate, high-resolution clock. The Intel Pentium(R) family of processors contains a free-running 64-bit time-stamp counter which increments once every clock cycle. It is accessible through Model Specific Registers built into the processor to profile hardware performance. These registers can also be used to monitor hardware performance counters that record counts of processor-specific events such as memory references, cache misses, floating point operations and bus activity. The Intel PentiumPro(R) and Pentium II(R) CPUs have two performance counters, available as 32-bit registers, that can count at most 66 different events (two at a time). Tools such as pperf [6] and PCL [1] provide access to these registers under the Linux operating system. PCL provides a uniform API, in the form of a library, to access the hardware performance counters on a number of platforms. Under Linux, the counters are not saved on each context switch, and PCL recommends performing measurements on a lightly loaded system. PerfAPI [2] is another project that will provide a uniform API for accessing hardware performance counters on different platforms, including Linux.
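As a sketch of how such a low-overhead clock can be read directly, the code below uses the rdtsc instruction through gcc-style inline assembly (x86-specific); converting cycle counts to time requires knowing the processor's clock rate.

    /* Sketch: read the free-running 64-bit time-stamp counter with the
     * rdtsc instruction.  x86-specific; gcc-style inline assembly. */
    #include <stdint.h>
    #include <stdio.h>

    static inline uint64_t read_tsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }

    int main(void)
    {
        uint64_t t0 = read_tsc();
        volatile double x = 0.0;
        for (int i = 0; i < 1000000; i++)     /* region being timed */
            x += i;
        uint64_t t1 = read_tsc();
        printf("elapsed: %llu cycles\n", (unsigned long long)(t1 - t0));
        return 0;
    }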
Profiling can present the total resources consumed by program-level entities. It can show the relative contribution of the routines to the profile of the application and can help locate performance hotspots.

4. Tracing

Typically, profiling shows the distribution of execution time across routines. It can show the code locations associated with specific bottlenecks, but it does not show the temporal aspect of performance variations. The advantage of profiling is that statistics can be maintained during execution and that it requires only a small amount of storage. Tracing the execution of a parallel program shows when an event occurred and where it occurred, in terms of the location in the source code and the process that executed it. An event is typically represented by an ordered tuple that consists of the event identifier, the timestamp when the event occurred, where it occurred (the location may be specified by the node and thread identifiers) and an optional field of event-specific information. In addition to the event trace, a table that maps the event identifier to an event name and its characteristics is also maintained. Tracing involves storing the data associated with an event in a buffer and periodically writing the buffer to stable storage. This involves event logging, timestamping, trace buffer allocation and trace output.
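The sketch below illustrates such an event tuple and the buffer-and-flush pattern; the record layout, buffer size and file format are illustrative rather than those of any particular tool.

    /* Sketch of a trace buffer: fixed-size event records are appended to
     * an in-memory buffer and flushed to stable storage when it fills.
     * Field layout and sizes are illustrative only. */
    #include <stdio.h>
    #include <time.h>

    typedef struct {
        int    event_id;   /* what happened (looked up in the event table) */
        double timestamp;  /* when it happened                             */
        int    node;       /* where it happened: process/node identifier   */
        int    thread;     /* where it happened: thread identifier         */
        long   info;       /* optional event-specific information          */
    } trace_event;

    #define BUFFER_EVENTS 1024
    static trace_event buffer[BUFFER_EVENTS];
    static int          n_events;
    static FILE        *trace_file;

    static double now(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec * 1e-9;
    }

    static void trace_flush(void)            /* write buffer to storage */
    {
        fwrite(buffer, sizeof(trace_event), n_events, trace_file);
        n_events = 0;
    }

    static void trace_log(int id, int node, int thread, long info)
    {
        buffer[n_events++] = (trace_event){ id, now(), node, thread, info };
        if (n_events == BUFFER_EVENTS)
            trace_flush();
    }

    int main(void)
    {
        trace_file = fopen("app.trc", "wb");
        for (int i = 0; i < 5000; i++)
            trace_log(1 /* routine entry */, 0, 0, i);
        trace_flush();
        fclose(trace_file);
        return 0;
    }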
The traces can subsequently be merged and corrected for perturbation [10] that may have been caused by the overhead introduced by the instrumentation. Finally, visualization and analysis of event traces help the user understand the behavior of the parallel program.

4.1 Clock Synchronization

Tracing a program that executes on a cluster of workstations requires access to an accurate, globally synchronized real-time clock, so that events can be ordered correctly with respect to a global time base. Clock synchronization can be implemented in hardware, as in shared memory multiprocessors. It can also be achieved by using software protocols such as the Network Time Protocol [11], which synchronizes the clocks to within a few milliseconds on a LAN or a WAN, or it can be implemented within the tool environment, as in BRISK [19]. For greater accuracy, a GPS satellite receiver may be used to synchronize the system clock to within a few microseconds, as described in [8]. Profiling does not require a globally synchronized real-time clock, but tracing does.
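As an illustration of how a software protocol estimates the offset between two clocks, the sketch below applies the standard NTP offset and delay formulas to a single request/reply timestamp exchange; a tracing tool could apply such an offset when merging per-node event traces. The timestamp values are made up for the example.

    /* Sketch: NTP-style clock offset estimate from one timestamp exchange.
     * t0 = local send time,  t1 = remote receive time,
     * t2 = remote send time, t3 = local receive time.
     * Assumes roughly symmetric network delay. */
    #include <stdio.h>

    static double clock_offset(double t0, double t1, double t2, double t3)
    {
        return ((t1 - t0) + (t2 - t3)) / 2.0;   /* remote clock minus local */
    }

    static double round_trip_delay(double t0, double t1, double t2, double t3)
    {
        return (t3 - t0) - (t2 - t1);           /* time spent on the network */
    }

    int main(void)
    {
        /* Example timestamps in seconds (illustrative values only). */
        double t0 = 100.000, t1 = 100.240, t2 = 100.241, t3 = 100.010;
        printf("offset: %f s, delay: %f s\n",
               clock_offset(t0, t1, t2, t3),
               round_trip_delay(t0, t1, t2, t3));
        return 0;
    }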
5. Performance Analysis and Visualization

Parallel programs produce vast quantities of multidimensional performance data. Representing this data without overwhelming the user with unnecessary detail is critical to a tool's success. The data can be represented effectively using a variety of visualization techniques [14] such as scatter plots, Kiviat diagrams, histograms, Gantt charts, interaction matrices, pie charts, execution graphs and tree displays. ParaGraph [3] is a trace visualization tool that incorporates the above techniques and has inspired performance views in several other tools. It is available under Linux and works with PICL traces. It contains a rich set of visualizations and is extensible by the user. Vampir [13] is a robust, commercial trace visualization tool available under Linux. Figure 3 is a Vampir display of traces generated by TAU. It shows a space-time diagram that highlights when events such as routine transitions take place on different processors.

Figure 3: Vampir displays TAU traces from a multi-threaded SMARTS application written in C++.

Pablo [14] provides user-directed analysis of performance data by providing a set of performance data transformation modules that are interconnected graphically to create an acyclic data analysis graph. Performance data flows through this graph, and metrics can be computed by different user-defined modules.

6. Conclusion

This paper presented a brief overview of some profiling and tracing tools that are available under Linux, and the techniques that they use for instrumentation, measurement and visualization. Linux is increasingly used by the scientific community for parallel simulations on clusters of personal computers. With the growing complexity of component-based, high performance scientific frameworks [18], tools often encounter a semantic gap: they present performance information that is too low level and does not relate well to source code constructs in the high-level parallel languages used to program these clusters. Unless tools can present performance data in ways that are meaningful to the user, and are consistent with the user's mental model of abstractions, their success will be limited.

7. Acknowledgments

This work was supported by the U.S. Government, Department of Energy DOE2000 program #DEFC0398ER.

References

[1] R. Berrendorf, H. Ziegler, PCL - The Performance Counter Library: A Common Interface to Access Hardware Performance Counters on Microprocessors, Technical Report FZJ-ZAM-IB-9816, Central Institute for Applied Mathematics, Research Centre Juelich GmbH, Oct. URL: ReDec/SoftTools/PCL/PCL.html
[2] S. Browne, G. Ho, P. Mucci, C. Kerr, Standard API for Accessing Hardware Performance Counters, Poster, SIGMETRICS Symposium on Parallel and Distributed Tools.
[3] M. Heath, J. Etheridge, Visualizing the Performance of Parallel Programs, IEEE Software, Vol. 8, No. 5. URL: Graph/ParaGraph.htmls
[4] S. Graham, P. Kessler, M. McKusick, An Execution Profiler for Modular Programs, Software - Practice and Experience, Vol. 13.
[5] S. Graham, P. Kessler, M. McKusick, gprof: A Call Graph Execution Profiler, Proceedings of the SIGPLAN '82 Symposium on Compiler Construction, SIGPLAN Notices, Vol. 17, No. 6, June 1982.
[6] M. Goda, Performance Monitoring for the Pentium and Pentium Pro Under the Linux Operating System, 1999.
[7] J. Hollingsworth, B. Buck, DyninstAPI Programmer's Guide. URL: projects/dyninstapi/
[8] J. Hollingsworth, B. Miller, Instrumentation and Measurement, in The Grid: Blueprint for a New Computing Infrastructure, I. Foster and C. Kesselman (Eds.), Morgan Kaufmann Publishers, San Francisco, 1999.
[9] B. Miller, M. Callaghan, J. Cargille, J. Hollingsworth, R. Irvin, K. Karavanic, K. Kunchithapadam, and T. Newhall, The Paradyn Parallel Performance Measurement Tool, IEEE Computer, 28(11), November 1995.
[10] A. D. Malony, Performance Observability, Ph.D. Thesis, University of Illinois, Urbana. Also available as CSRD Report No. 1034, Sept.
[11] Network Time Protocol.
[12] Oak Ridge National Laboratory, Portable Instrumented Communication Library, 1999.
[13] Pallas GmbH, Vampir - Visualization and Analysis of MPI Resources. URL: vampir.html
[14] D. Reed and R. Ribler, Performance Analysis and Visualization, in The Grid: Blueprint for a New Computing Infrastructure, I. Foster and C. Kesselman (Eds.), Morgan Kaufmann Publishers, San Francisco, 1999.
[15] S. Shende, A. D. Malony, J. Cuny, K. Lindlan, P. Beckman, and S. Karmesin, Portable Profiling and Tracing for Parallel Scientific Applications using C++, Proceedings of the SIGMETRICS Symposium on Parallel and Distributed Tools, ACM, Aug.
[16] Silicon Graphics Inc., SpeedShop User's Guide, 1999.
[17] A. Srivastava, A. Eustace, Atom: A System for Building Customized Program Analysis Tools, in SIGPLAN Conference on Programming Language Design and Implementation, ACM, 1994.
[18] The Staff, Advanced Computing Laboratory, Los Alamos National Laboratory, Taming Complexity in High-Performance Computing, White paper, Aug.
[19] A. Waheed, D. Rover, M. Mutka, H. Smith, and A. Bakic, Modeling, Evaluation and Adaptive Control of an Instrumentation System, Proceedings of the IEEE Real-Time Technology and Applications Symposium (RTAS '97), June 1997.
[20] J. Yan, Performance Tuning with AIMS - An Automated Instrumentation and Monitoring System for Multicomputers, Proc. 27th Hawaii Intl. Conf. on System Sciences, Hawaii, Jan.