vmprof Documentation Release 0.1 Maciej Fijalkowski, Antonio Cuni, Sebastian Pawlus

Similar documents
Java Troubleshooting and Performance

Basics of VTune Performance Analyzer. Intel Software College. Objectives. VTune Performance Analyzer. Agenda

New Relic & JMeter - Perfect Performance Testing

IBM SDK, Java Technology Edition Version 1. IBM JVM messages IBM

Zing Vision. Answering your toughest production Java performance questions

Enterprise Manager Performance Tips

Network Connect Performance Logs on MAC OS

RTI Quick Start Guide for JBoss Operations Network Users

RTI Quick Start Guide

Kernel comparison of OpenSolaris, Windows Vista and. Linux 2.6

Determine the process of extracting monitoring information in Sun ONE Application Server

Monitoring, Tracing, Debugging (Under Construction)

CDH installation & Application Test Report

WebSphere Commerce V7 Feature Pack 3

LAVA Project Update. Paul Larson

DevKey Documentation. Release 0.1. Colm O Connor

Network Connect & Junos Pulse Performance Logs on Windows

MarkLogic Server. Connector for SharePoint Administrator s Guide. MarkLogic 8 February, 2015

id_prob_result_coredump_aix.ppt Page 1 of 15

CA Nimsoft Monitor. Probe Guide for Apache HTTP Server Monitoring. apache v1.5 series

IERG 4080 Building Scalable Internet-based Services

Chapter 2 System Structures

How To Gather Log Files On A Pulse Secure Server On A Pc Or Ipad (For A Free Download) On A Network Or Ipa (For Free) On An Ipa Or Ipv (For An Ubuntu) On Your Pc

Example of Standard API

Online Backup Client User Manual

NetBeans Profiler is an

Trace-Based and Sample-Based Profiling in Rational Application Developer

MuL SDN Controller HOWTO for pre-packaged VM

CPSC 2800 Linux Hands-on Lab #7 on Linux Utilities. Project 7-1

CA Nimsoft Monitor. Probe Guide for URL Endpoint Response Monitoring. url_response v4.1 series

Tuning WebSphere Application Server ND 7.0. Royal Cyber Inc.

ReproLite : A Lightweight Tool to Quickly Reproduce Hard System Bugs

Oracle Corporation Proprietary and Confidential

RecoveryVault Express Client User Manual

IVR Studio 3.0 Guide. May Knowlarity Product Team

NaviCell Data Visualization Python API

Django Two-Factor Authentication Documentation

CLC Server Command Line Tools USER MANUAL

Smarter Balanced Reporting (RFP 15) Developer Guide

Instrumentation Software Profiling

Using jvmstat and visualgc to Solve Memory Management Problems

Online Backup Linux Client User Manual

Emerald. Network Collector Version 4.0. Emerald Management Suite IEA Software, Inc.

Performance Monitoring API for Java Enterprise Applications

find model parameters, to validate models, and to develop inputs for models. c 1994 Raj Jain 7.1

ALERT installation setup

Troubleshooting PHP Issues with Zend Server Code Tracing

Online Backup Client User Manual

Chapter 3 Operating-System Structures

Deploying Microsoft Operations Manager with the BIG-IP system and icontrol

PTC System Monitor Solution Training

Configuring System Message Logging

1. Product Information

Advanced Performance Forensics

Python Driver 1.0 for Apache Cassandra

IBM Support Assistant v5. Review and hands-on by Joseph

LinuxCon Europe Cloud Monitoring and Distribution Bug Reporting with Live Streaming and Snapshots.

Online Backup Client User Manual Linux

Efficient and Large-Scale Infrastructure Monitoring with Tracing

Rebuild Perfume With Python and PyPI

RTI v3.3 Lightweight Deep Diagnostics for LoadRunner

Creating a DUO MFA Service in AWS

Troubleshooting for Yamaha router

Subversion Integration

OPERATING SYSTEM SERVICES

Using the VCDS Application Monitoring Tool

netflow-indexer Documentation

mypro Installation and Handling Manual Version: 7

Design and Implementation of a Real-Time Cloud Analytics Platform

ASR 5x00 Series SGSN Authentication and PTMSI Reallocation Best Practices

pbuilder Debian Conference 2004

Monitoring and Managing a JVM

User-ID Best Practices

High Performance Computing in Aachen

PANIC, The ALBA Alarm System

How to extend Puppet using Ruby

Simulation of wireless ad-hoc sensor networks with QualNet

HeapStats: Your Dependable Helper for Java Applications, from Development to Operation

AdRadionet to IBM Bluemix Connectivity Quickstart User Guide

An Introduction to Using Python with Microsoft Azure

irods and Metadata survey Version 0.1 Date March Abhijeet Kodgire 25th

How to Run Spark Application

IBM Software Group. SW5706 JVM Tools IBM Corporation 4.0. This presentation will act as an introduction to JVM tools.

USER GUIDE. Diagnostic Web Server FW ver BrightSign, LLC Lark Ave., Suite B Los Gatos, CA

Project 4: SDNs Due: 11:59 PM, Dec 11, 2014

System Copy GT Manual 1.8 Last update: 2015/07/13 Basis Technologies

IceWarp to IceWarp Server Migration

PZVM1 Administration Guide. V1.1 February 2014 Alain Ganuchaud. Page 1/27

A Comparative Study on Vega-HTTP & Popular Open-source Web-servers

A Study of Performance Monitoring Unit, perf and perf_events subsystem

Tuning Tableau Server for High Performance

Nikhil s Web Development Helper

Course Description. Course Audience. Course Outline. Course Page - Page 1 of 5

About the VM-Series Firewall

Building Applications Using Micro Focus COBOL

vrealize Operations Manager Customization and Administration Guide

Bitrix Site Manager ASP.NET. Installation Guide

Contents. 2. cttctx Performance Test Utility Server Side Plug-In Index All Rights Reserved.

IBM Software Services for Lotus Consulting Education Accelerated Value Program. Log Files IBM Corporation

Performance of VMware vcenter (VC) Operations in a ROBO Environment TECHNICAL WHITE PAPER

Transcription:

vmprof Documentation Release 0.1 Maciej Fijalkowski, Antonio Cuni, Sebastian Pawlus January 23, 2016

Contents 1 Introduction 1 1.1 Requirements............................................... 1 1.2 Installation................................................ 1 1.3 Usage................................................... 2 2 API 5 2.1 Module level functions.......................................... 5 2.2 Stats object.............................................. 5 2.3 Tree object............................................... 5 3 Why a new profiler? 7 4 How does it work? 9 i

ii

CHAPTER 1 Introduction vmprof is a lightweight profiler for CPython 2.7, CPython 3 and PyPy. It helps you to find and understand the performance bottlenecks in your code. vmprof is a statistical profiler: it gathers information about your code by continuously taking samples of the call stack of the running program, at a given frequency. This is similar to tools like vtune or gperftools: the main difference is that those tools target C and C-like languages and are not very helpful to profile higher-level languages which run on top of a virtual machine, while vmprof is designed specifically for them. There are three primary modes. The recommended one is to use our server infrastructure for a web-based visualization of the result: python -m vmprof --web <program.py> <program parameters> If you prefer a barebone terminal-based visualization, which will display only some basic statistics: python -m vmprof <program.py> <program parameters> To display a terminal-based tree of calls: python -m vmprof -o output.log <program.py> <program parameters> vmprofshow output.log To upload an already saved profile log to the vmprof web server: python -m vmprof.upload output.log For more advanced use cases, vmprof can be invoked and controlled from within the program using the given API. 1.1 Requirements vmprof 0.2 works only on x86_64 and x86 linux, Mac OS X and x86 windows. It works on CPython 2.7, 3.4 and 3.5. PyPy support is already on a branch, but you will need to wait for PyPy 4.1 release to have it officially supported. 1.2 Installation Installation of vmprof is performed with a simple command: pip install vmprof 1

vmprof Documentation, Release 0.1 Since it depends on some C code for CPython, you need a compiler. sudo apt-get install python-dev 1.3 Usage Main usage of vmprof is via command line. The following shows the basic usage to profile an example x.py program: fijal@hermann:~/src/vmprof-python$ cat x.py def g(i, s): s += i return s def h(i, s): return g(i, s) + 3 def f(): i = 0 s = 0 while i < 10000000: s = h(i, s) i += 1 if name == ' main ': f() fijal@hermann:~/src/vmprof-python$ python -m vmprof x.py vmprof output: % of snapshots: name: 100.0% <module> x.py:2 100.0% f x.py:9 55.0% h x.py:6 14.4% g x.py:2 We stronly suggest using the --web option that will display you a much nicer web interface hosted on vmprof.com. If you prefer to host your own vmprof visualization server, you need the vmprof-server package. After -m vmprof you can specify some options: --web - Use the web-based visualization. By default, the result can be viewed on our server. --web-url - the URL to upload the profiling info as JSON. The default is vmprof.com --web-auth - auth token for user name support in the server. -p period - float that gives you how often the profiling happens (the max is about 300 Hz, rather don t touch it). -n - enable all C frames, only useful if you have a debug build of PyPy or CPython. -o file - save logs for later --help - display help --config - a ini format config file with all options presented above. When passing a config file along with command line arguments, the command line arguments will take precedence and override the config file values. Example config.ini file: 2 Chapter 1. Introduction

vmprof Documentation, Release 0.1 web-url = vmprof.com web-auth = ffb7d4bee2d6436bbe97e4d191bf7d23f85dfeb2 period = 0.1 1.3. Usage 3

vmprof Documentation, Release 0.1 4 Chapter 1. Introduction

CHAPTER 2 API There is also an API that can bring more details to the table, but consider it unstable. The current API usage is as follows: 2.1 Module level functions vmprof.enable(fileno, period=0.01) - enable writing vmprof data to a file described by a fileno file descriptor. Timeout is in float seconds. The minimal available resolution is 4ms, we re working on improving that (note the default is 10ms) vmprof.disable() - finish writing vmprof data, disable the signal handler vmprof.read_profile(filename) - read vmprof data from filename and return Stats instance. 2.2 Stats object Stats object gives you an overview of data: stats.get_tree() - Gives you a tree of objects 2.3 Tree object Tree is made of Nodes, each node supports at least the following interface: node[key] - a fuzzy search of keys (first match) repr(node) - basic details node.flatten() - returns a new tree that flattens all the metadata (gc, blackhole etc.) node.walk(callback) - call a callable of form callback(root) that will be invoked on each node 5

vmprof Documentation, Release 0.1 6 Chapter 2. API

CHAPTER 3 Why a new profiler? There is a variety of python profilers on the market: CProfile is the one bundled with CPython, which together with lsprofcalltree.py provides good info and decent visualization; plop is an example of statistical profiler. We wanted a profiler with the following characteristics: Minimal overhead, small enough that enabling the profiler in production is a viable option. Ideally the overhead should be in the range 1-5%, with the possibility to tune it for more accurate measurments Ability to display a full stack of calls, so it can show how much time was spent in a function, including all its children Good integration with PyPy: in particular, it must be aware of the underlying JIT, and be able to show how much time is spent inside JITted code, Garbage collector and normal intepretation. None of the existing solutions satisfied our requirements, hence we decided to create our own profiler. In particular, cprofile is slow on PyPy, does not understand the JITted code very well and is shown in the JIT traces. 7

vmprof Documentation, Release 0.1 8 Chapter 3. Why a new profiler?

CHAPTER 4 How does it work? As most statistical profilers, the core idea is to have a signal handler which periodically inspects and dumps the stack of the running program: the most frequently executed parts of the code will be dumped more often, and the postprocessing and visualization tools have the chance to show the end user usueful info about the behavior of the profiled program. This is the very same approach used e.g. by gperftools. However, when profiling an interpreter such as CPython, inspecting the C stack is not enough, because most of the time will always be spent inside the opcode dispatching loop of the virtual machine (e.g., PyEval_EvalFrameEx in case of CPython). To be able to display useful information, we need to know which Python-level function correspond to each C-level PyEval_EvalFrameEx. This is done by reading the stack of Python frames instead of C stack. Additionally, when on top of PyPy the C stack contains also stack frames which belong to the JITted code: the vmprof signal handler is able to recognize and extract the relevant info from those as well. Once we have gathered all the low-level info, we can post-process and visualize them in various ways: for example, we can decide to filter out the places where we are inside the select() syscall, etc. The machinery to gather the information has been the focus of the initial phase of vmprof development and now it is working well: we are currently focusing on the frontend to make sure we can process and display the info in useful ways. 9