STUDY OF PERFORMANCE COUNTERS AND PROFILING TOOLS TO MONITOR PERFORMANCE OF APPLICATION


1 DIPAK PATIL, 2 PRASHANT KHARAT, 3 ANIL KUMAR GUPTA
1,2 Department of Information Technology, Walchand College of Engineering, Sangli (MH), India.
3 Joint Director, CDAC, Pune (MH), India.

Abstract- Monitoring the performance of applications is an important step in making them effective and efficient. For this reason, modern processors incorporate a Performance Monitoring Unit (PMU), which monitors application performance by maintaining performance counters. Various profiling tools have been developed to monitor events and extract their counts from these counters. This paper studies some of these tools, namely PerfCtr, Perf, PAPI, and Intel EC SDK, along with their mechanisms and usage, and then identifies the tools most appropriate for reading the counters of the PMU.

Keywords- PMU, PMC, Performance Event, MSR.

I. INTRODUCTION

Developing applications that combine high performance with low power consumption is now an active topic, and achieving this requires tuning the application on the basis of a low-level assessment of its behavior. Such an assessment is possible by monitoring the performance events that occur while the application executes [1][2]. Modern processors generally come with a Performance Monitoring Unit (PMU) [3], which Intel architectures have supported since the Pentium processor. The PMU consists of performance counters that count events occurring in the processor and the system. Several tools have been developed to work with the PMU, such as perfctr, perf, the Performance API (PAPI), and Intel EC SDK. Perfctr is an open source Linux package [4] consisting of a kernel patch and a driver for evaluating the performance parameters of an application. Perf [5] is another profiling tool, available for Linux 2.6 and later versions.
It is a simple command line tool that gives access to the performance parameters of an application and is based on the perf_event interface [6] exported by recent versions of the Linux kernel. PAPI [7] is a tool that uses a higher-level API to set up and access performance counters and measure performance events. Intel Energy Checker SDK (Intel EC SDK) is a tool developed by Intel for building energy-efficient applications [8]. Section II of this paper studies the PMU architecture of the Intel Core micro-architecture, Section III discusses the internal working mechanisms of the tools, and Section IV lists the Intel x86 architectures supported by these tools.

II. PMU ARCHITECTURE

The PMU is part of the processor. It includes Performance Monitoring Counters (PMCs) and a number of Model Specific Registers (MSRs) that configure them. A PMC is a counter that holds the number of occurrences of an event. Here we consider performance monitoring on the Intel Core micro-architecture, which provides two general-purpose counters and three fixed-function counters [3].

Table I Performance Monitoring Counters

General-Purpose PMC    Fixed-Function PMC
IA32_PMC0              IA32_FIXED_CTR0
IA32_PMC1              IA32_FIXED_CTR1
                       IA32_FIXED_CTR2

The PMU has the following MSRs to configure and control the PMCs, to report their status, and to handle their overflow.

IA32_PERFEVTSELx: Each general-purpose PMC is configured by writing to the bit fields of its respective IA32_PERFEVTSELx MSR.

IA32_FIXED_CTR_CTRL: The fixed-function PMCs are configured by writing to the bit fields of this MSR.

The most frequent operations in programming performance events are enabling or disabling event counting and checking the status of counter overflows. These are done globally through the following MSRs.

IA32_PERF_GLOBAL_CTRL: This MSR enables or disables event counting for all, or any combination of, fixed-function and general-purpose PMCs.

IA32_PERF_GLOBAL_STATUS: This MSR allows software to query counter overflow conditions on any combination of fixed-function and general-purpose PMCs.

IA32_PERF_GLOBAL_OVF_CTRL: This MSR allows software to clear counter overflow conditions on any combination of fixed-function and general-purpose PMCs.
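
To make the "writing to bit fields" of these MSRs concrete, the following minimal C sketch composes register values only; it does not write any MSR, which requires privileged code. The bit positions follow the architectural performance monitoring layout in the Intel manual [3]. The macro and function names are illustrative assumptions and are not taken from any of the tools discussed in this paper.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions of IA32_PERFEVTSELx as laid out in the Intel manual [3].
 * The macro names are illustrative, not an existing API. */
#define EVTSEL_USR (UINT64_C(1) << 16) /* count while in user mode */
#define EVTSEL_OS  (UINT64_C(1) << 17) /* count while in kernel mode */
#define EVTSEL_INT (UINT64_C(1) << 20) /* raise an interrupt on overflow */
#define EVTSEL_EN  (UINT64_C(1) << 22) /* enable the counter */

/* Compose a value for IA32_PERFEVTSELx: event code in bits 0-7,
 * unit mask in bits 8-15, control flags above them. */
uint64_t evtsel_encode(uint8_t event, uint8_t umask, uint64_t flags)
{
    return (uint64_t)event | ((uint64_t)umask << 8) | flags;
}

/* Compose a value for IA32_PERF_GLOBAL_CTRL: general-purpose enable bits
 * start at bit 0, fixed-function enable bits start at bit 32. */
uint64_t global_ctrl_mask(unsigned n_gp, unsigned n_fixed)
{
    assert(n_gp < 32 && n_fixed < 32);
    uint64_t gp = (UINT64_C(1) << n_gp) - 1;
    uint64_t fixed = (UINT64_C(1) << n_fixed) - 1;
    return gp | (fixed << 32);
}

/* Test whether general-purpose counter idx is flagged as overflowed in a
 * value read from IA32_PERF_GLOBAL_STATUS. */
int gp_overflowed(uint64_t global_status, unsigned idx)
{
    assert(idx < 32);
    return (int)((global_status >> idx) & 1);
}
```

For example, evtsel_encode(0x3C, 0x00, EVTSEL_USR | EVTSEL_OS | EVTSEL_EN) selects the architectural unhalted core cycles event in both privilege levels and evaluates to 0x43003C, while global_ctrl_mask(2, 3) evaluates to 0x700000003 for the two general-purpose and three fixed-function counters of Table I.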

III. WORKING OF TOOLS

A. Perfctr

Perfctr is an open source tool for profiling an application by accessing performance counters. The Linux package Perfctr 2.x [4] consists of a driver and a kernel patch. The patch modifies the process structure to support per-process counters, which are used to profile the hardware counters. The driver makes it possible to program and read values from the performance monitoring unit found in every modern processor.

The mechanism used by perfctr is as follows. Every Linux process maintains its own set of virtual PMCs, which are mapped onto the hardware PMCs of the processor. These virtual PMCs are private to each process. Each process also has a virtual Time-Stamp Counter (TSC). The virtual PMCs have 64-bit precision, whereas processors incorporate 40- or 48-bit PMCs. A process accesses its virtual PMCs by opening /proc/self/perfctr and issuing system calls on the resulting file descriptor. A user-space library supplied with the package provides a higher-level interface.

The driver also supports global-mode, or system-wide, PMCs. In this mode, each PMC on each processor can be controlled and read. The PMCs and TSCs on active processors are sampled periodically, and the accumulated sums have 64-bit precision. Global-mode PMCs are accessed via the /dev/perfctr device file; the user-space library again provides a higher-level interface.

The Perfctr package has two parts: a patch to the kernel and a driver to access the PMCs.

i. Patch to kernel: PMCs are general-purpose registers within the PMU that hold event counts while processes execute. One way to use these registers to count events on a per-process basis is to modify the process structure. The patch modifies the per-process data structure and the routines used for context switching so that each process supports the PMCs and holds their counter values.
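
The idea of keeping a 64-bit virtual counter per process on top of a narrower hardware PMC, updated at context switches, can be sketched as follows. This is an illustrative sketch and not the perfctr patch itself: the struct virt_pmc type and the function names are hypothetical, and the code assumes the hardware counter wraps at most once while the process is scheduled in.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-process state for one virtual PMC. */
struct virt_pmc {
    uint64_t sum;   /* 64-bit accumulated count for this process */
    uint64_t start; /* hardware counter value when last scheduled in */
};

/* Difference between two readings of a width-bit hardware counter,
 * correct as long as the counter wrapped at most once in between. */
uint64_t pmc_delta(uint64_t prev, uint64_t now, unsigned width)
{
    assert(width >= 1 && width <= 64);
    uint64_t mask = (width == 64) ? UINT64_MAX : (UINT64_C(1) << width) - 1;
    return (now - prev) & mask;
}

/* Called when the process is scheduled in: remember the hardware value. */
void virt_pmc_resume(struct virt_pmc *v, uint64_t hw_now)
{
    v->start = hw_now;
}

/* Called when the process is scheduled out: fold the elapsed count
 * into the 64-bit sum. */
void virt_pmc_suspend(struct virt_pmc *v, uint64_t hw_now, unsigned width)
{
    v->sum += pmc_delta(v->start, hw_now, width);
}
```

For a 40-bit counter that reads 2^40 - 5 when the process is scheduled in and 10 when it is scheduled out, the masked subtraction yields 15 events, so the wraparound of the hardware counter does not disturb the 64-bit sum.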

Changes are made in process-specific files to support the per-process virtual performance counters and to handle them across context switches. At every process switch, the hardware context of the process being replaced must be saved. The process descriptor is therefore modified to save the hardware counter values in the process context when the process is switched out. The process-handling routine is also modified to call the routines of the virtual per-process counter driver, which save and restore the PMC values.

ii. Driver to provide PMC access: The perfctr driver provides a /proc interface. By opening the file /proc/self/perfctr, a process can obtain a perfctr structure containing its PMC values. The PMC driver also emulates a device, /dev/perfctr, to which users can issue ioctls to obtain the values of the various PMCs. It maps functions to these ioctls in its file operations structure. The ioctls that can be sent to this device, and the actions they trigger, are as follows.

PERFCTR_INFO returns to the user, via the copy_to_user function, a structure describing the various counters.
GPERFCTR_CONTROL makes the driver allocate the perfctr structures and start a timer used to sample the values at periodic intervals.
GPERFCTR_READ returns a perfctr structure with the updated PMC values.
GPERFCTR_STOP releases the timer and resets the PMCs to their previous values.

B. Perf

Perf is an open source command line tool provided with Linux for monitoring the performance of applications. It is available in Linux kernel 2.6 and later versions, and measures the performance parameters of an application using the PMU. Measuring these parameters requires programming the PMU and retrieving the counter values and related information. The Linux interface named perf_event is used for this purpose. Linux perf includes the files core.c and perf_event.c, which provide an interface between the Linux kernel and user-space performance monitoring tools, as shown in Fig. 1.

Figure 1. Architecture of Perf

i. Perf_event

Perf_event is a Linux subsystem. Its modules are statically loaded when the Linux kernel is loaded. It assigns a file descriptor to each event for a given thread or process. The file descriptor is obtained by calling perf_event_open() [5], whose arguments specify the event to be monitored. The call configures the hardware PMCs for that event and returns a file descriptor that is used to access the performance counter value. Perf_event has various features [9], such as generalized events available on most modern processors, event scheduling, and multiplexing, which allows more events to be measured than the number of counters supported by the hardware. It also provides software events.

    int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu, int group_fd, unsigned long flags)

attr gives the detailed description of the event to be monitored.
pid specifies the process whose events are monitored.
cpu specifies the CPU on which events are monitored.
group_fd is used to group several events together.

Each file descriptor corresponds to one event that is measured; file descriptors can be grouped together to measure multiple events simultaneously. Events can be enabled and disabled either via ioctl or via prctl.

    ssize_t read(int fd, void *buf, size_t count)

The counter value is obtained by passing the file descriptor to the read system call. Internally, perf_event calls various functions to interact with the PMU and read values from the hardware. When perf_event is loaded, it registers a non-maskable interrupt (NMI) handler. An interrupt is generated when a counter overflows; the handler saves the values of the registers, and the counters are reset to predefined values.

To measure per-thread or per-process performance, the perf_event subsystem invokes some functions of the Linux scheduler. At every context switch, the context of the current events is saved in the task_struct structure. Once the context switch is over, the events attached to the newly scheduled process are accessed via the current macro, which points to the currently running process.

ii. Libpfm4

The perf command provides a subset of common performance counter events, such as processor clock cycles, instruction counts, and cache event metrics. However, most processors provide many other implementation-specific hardware events, such as floating-point operations and micro-architectural events (for example, stalls due to hardware resource limits). Accessing these events requires the raw event encoding in perf_event, which can be tedious. Libpfm4 [10] provides a mapping mechanism that refers to these implementation-specific hardware events by name. The library is used in conjunction with the Linux perf_event API. An event is encoded for perf_event with

    int pfm_get_perf_event_encoding(const char *str, int dfl_plm, struct perf_event_attr *attr, char **fstr, int *idx)

str is the event string to encode.
dfl_plm is the default privilege level mask.
attr is the perf_event-specific event data structure filled in by this function.

C. Intel EC SDK

Intel EC SDK [8] is a tool for estimating the energy and power consumed by the platform while an application executes on it. Intel built this tool with the goal of supporting the development of energy-efficient applications through analysis of their energy and power consumption. The tool is useful for evaluating the impact on an application's energy consumption of changes to hardware, hardware settings, software algorithms, and libraries.

The tool can also be used to count events by inserting API calls into the application source code. It uses a sequence of logical counters called a productivity link (PL). A counter stores the number of times an event or process has occurred. The tool allows counters to be imported to and exported from an application through a PL, using the following steps.

i. Create the PL.
ii. Specify the counters to be created and maintained in the PL.
iii. Use the PL to import and export counters to and from the application.
iv. Close the PL.

Fig. 2. Using Intel EC to import and export counters

The components of Intel EC SDK are as follows:

i. Intel Core API
ii. Interpretability tool
iii. Energy and Temperature Monitoring tool
iv. Scripting tool
v. SDK companion application

D. PAPI

PAPI is a well-known and commonly used tool for measuring the performance of an application. It provides a consistent interface and methodology for using the performance counters found in modern microprocessors.
The goal of PAPI [7] is to provide an easy-to-use, common set of interfaces that give access to these performance counters on all major processor platforms, thereby giving application developers the information needed for performance analysis, modeling, and tuning. PAPI provides two interfaces to the hardware counters. The high-level interface provides simple access to the counters and makes starting, stopping, and reading them simple for a specified number of performance events. The low-level interface manages hardware events in user-defined groups called EventSets. PAPI includes a predefined set of events meant to represent the lowest common denominator of a good counter implementation, the intent being that the same tool would count similar and possibly comparable events when run on different platforms. If the programmer chooses to use this set of standardized events, the source code need not be changed and only a recompile is necessary.

In addition to providing access to the counters, PAPI handles counter overflow. It provides user callbacks on counter overflow and hardware-based, SVR4-compatible profiling, regardless of whether the operating system supports it.

Fig. 3. Architecture of PAPI

IV. SUPPORTED ARCHITECTURES

The tools studied above are compatible with specific micro-architectures. Table II shows the x86 micro-architectures supported by these tools.

Table II Supported x86 architectures

Name                               PAPI   Perfctr   Perf_event   Libpfm4
Pentium                            No     Yes       No           No
Pentium Pro/II/III/M/4/D           Yes    Yes       Yes          Yes
Core Duo                           Yes    Yes       Yes          Yes
Core 2                             Yes    Yes       Yes          Yes
Atom                               Yes    Yes       Yes          Yes
Atom Cedarview                     Yes    No        Yes          Yes
Atom Silvermont                    Yes    No        Yes          Yes
Nehalem / Nehalem EX / Westmere    Yes    Yes       Yes          Yes
Westmere EX                        Yes    No        Yes          Yes
Sandy Bridge                       Yes    No        Yes          Yes
Sandy Bridge EP                    Yes    No        Yes          Yes
Ivy Bridge                         Yes    No        Yes          Yes
Ivy Bridge EP                      Yes    No        Yes          Yes
Haswell / Haswell EP               Yes    No        Yes          Yes
Broadwell                          No     No        Yes          Yes
Knights Corner                     Yes    No        Yes          Yes

CONCLUSION

Among these tools, perfctr, perf, and PAPI are open source, and PAPI and perf are the more user friendly, i.e. easy to set up and easy to use for profiling performance counters. Perfctr is limited to older kernel versions that are rarely used in today's systems. The Intel EC SDK is less suited to accessing performance counters than the other tools.
However, it can be used to measure the energy consumption of a system with external power meters. Perf is a command line tool for accessing performance counters on the Linux platform. It builds on the Linux perf_event interface, upon which other performance monitoring tools have also been developed. PAPI is another strong alternative that measures performance by directly accessing the hardware counters. Both PAPI and perf work with a large number of micro-architectures. We conclude that perf and PAPI are the most appropriate tools for working with performance counters and for performance profiling.

REFERENCES

[1] W. Lloyd Bircher and Lizy K. John, Complete System Power Estimation Using Processor Performance Events, IEEE Transactions on Computers, Vol. 61, No. 4, April.
[2] Rance Rodrigues, Arunachalam Annamalai, Israel Koren, and Sandip Kundu, A Study on the Use of Performance Counters to Estimate Power in Microprocessors, IEEE Transactions on Circuits and Systems-II: Express Briefs, Vol. 60, No. 12, December.
[3] Intel 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide, Part 2, developer.intel.com.
[4] Source Code of perfctr patch by Mikael Pettersson.

[5] Linux manual page for perf_event_open rf_event_open.html
[6] Aman Singh and Anup Buchke, A Study of Performance Monitoring Unit, perf and perf_events subsystem.
[7] S. Browne, J. Dongarra, N. Garner, K. London, P. Mucci, A Portable Programming Interface for Performance Evaluation on Modern Processors.
[8] Intel Energy Checker SDK Code, Resource and Documentation.
[9] Vincent M. Weaver, Linux perf_event features and overhead.
[10] Libpfm4 manual page.


More information

Audit Trail Administration

Audit Trail Administration Audit Trail Administration 0890431-030 August 2003 Copyright 2003 by Concurrent Computer Corporation. All rights reserved. This publication or any part thereof is intended for use with Concurrent Computer

More information

Operating Systems. Lecture 03. February 11, 2013

Operating Systems. Lecture 03. February 11, 2013 Operating Systems Lecture 03 February 11, 2013 Goals for Today Interrupts, traps and signals Hardware Protection System Calls Interrupts, Traps, and Signals The occurrence of an event is usually signaled

More information
