Performance Testing for GDB




Yao Qi, yao@codesourcery.com
CodeSourcery / Mentor Graphics
2014-09-13


Introduction
- We first need something to test and measure GDB performance; there was no standard mechanism to show it.
- The perf testing framework was proposed in August and September 2013.
- Perf test cases and patches for perf improvements were written from September to November.
- "If you cannot measure it, you cannot improve it." (Lord Kelvin)
- Thanks to Doug and Sanimir for their comments and patient review.

Goals
- Get to know the new perf testing framework in GDB and how to write perf test cases on top of it.
- Demonstrate how the test cases show perf improvements.
- Show how we use the framework to track GDB performance.
[Figure: CPU time (s) versus number of single steps, original vs. patched]

Overview
- It is easy to write a test case for a certain aspect, e.g. symbol lookup or single step.
- Basic measurements are provided (such as CPU time and memory usage), and test-specific measurements can be added too.
- Both native debugging and remote debugging are supported.
- A test case can optionally have a warm-up phase.
- Results can be saved in different formats.
- The executable in a test case can be pre-compiled.

- Use DejaGNU to invoke the compiler and start GDB (and/or GDBserver).
- GDB loads a Python script in which some operations are performed.
- The framework collects the performance data and saves it in the desired format.
- Detect performance regressions.
[Diagram: steps: compile test case gdb.perf/foo.c; start GDB and GDBserver; load program; run gdb.perf/foo.py. Files involved: lib/gdb.exp, lib/perftest.exp, gdb.perf/foo.exp]
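The last step, detecting regressions, can be sketched as a comparison of two saved result sets. This is a minimal sketch under stated assumptions, not part of the framework: the function name find_regressions, the flat {name: seconds} result layout, and the 10% threshold are all hypothetical.

```python
def find_regressions(baseline, current, threshold=0.10):
    """Return {name: (old, new)} for measurements that got slower.

    baseline/current map a measurement name to a time in seconds
    (hypothetical layout); threshold is the tolerated relative slowdown.
    """
    regressions = {}
    for name, old in baseline.items():
        new = current.get(name)
        if new is None or old <= 0:
            continue  # nothing to compare against
        if (new - old) / old > threshold:
            regressions[name] = (old, new)
    return regressions


baseline = {"skip-prologue cpu_time 1": 2.0, "skip-prologue cpu_time 2": 2.1}
current = {"skip-prologue cpu_time 1": 2.05, "skip-prologue cpu_time 2": 2.9}
found = find_regressions(baseline, current)
print(found)
```

A relative threshold keeps small run-to-run noise from raising an alarm; a real comparison would probably also want several runs per measurement.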

skip-prologue.exp

    PerfTest::assemble {
        # Compile
        if { [gdb_compile "$srcdir/$subdir/$srcfile" ${binfile} executable {debug}] != "" } {
            return -1
        }

        return 0
    } {
        # Start up
        clean_restart $binfile

        if ![runto_main] {
            fail "Can't run to main"
            return -1
        }
    } {
        # Run
        global SKIP_PROLOGUE_COUNT

        set test "run"
        gdb_test_multiple "python SkipPrologue\($SKIP_PROLOGUE_COUNT\).run()" $test {
            -re "Breakpoint $decimal at \[^\n\]*\n" {
                # GDB prints some messages on breakpoint creation.
                # Consume the output to avoid internal buffer full.
                exp_continue
            }
            -re ".*$gdb_prompt $" {
                pass $test
            }
        }
    }

skip-prologue.py

    import gdb  # Provided by GDB's embedded Python interpreter.

    from perftest import perftest

    # Extend from TestCaseWithBasicMeasurements.
    class SkipPrologue(perftest.TestCaseWithBasicMeasurements):
        def __init__(self, count):
            super(SkipPrologue, self).__init__("skip-prologue")
            self.count = count

        # The test itself.
        def _test(self):
            for _ in range(1, self.count):
                # Insert breakpoints on function f1 and f2.
                bp1 = gdb.Breakpoint("f1")
                bp2 = gdb.Breakpoint("f2")
                # Remove them.
                bp1.delete()
                bp2.delete()

        def warm_up(self):
            self._test()

        # Measure the test.
        def execute_test(self):
            for i in range(1, 4):
                # Turn the code cache off and on again before each measured run.
                gdb.execute("set code-cache off")
                gdb.execute("set code-cache on")
                self.measure.measure(self._test, i)

[Class diagram: TestCase (name, measure; execute_test(), run(), warm_up()); TestCaseWithBasicMeasurements extends TestCase; SkipPrologue (warm_up(), execute_test()) extends TestCaseWithBasicMeasurements]
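The run / warm_up / execute_test structure can be tried outside GDB with a small stand-in. This is a simplified sketch, not the framework's actual perftest module: the Measure class, the body of run(), and the BusyLoop test are hypothetical, and only CPU time is recorded.

```python
import time


class Measure:
    """Stand-in measurement object: records (run id, CPU seconds) pairs."""

    def __init__(self):
        self.results = []

    def measure(self, func, run_id):
        start = time.process_time()
        func()
        self.results.append((run_id, time.process_time() - start))


class TestCase:
    """Stand-in base class: run() drives the optional warm-up and the test."""

    def __init__(self, name):
        self.name = name
        self.measure = Measure()

    def warm_up(self):
        pass  # Optional; subclasses may override.

    def execute_test(self):
        raise NotImplementedError

    def run(self, warm_up=True):
        if warm_up:
            self.warm_up()
        self.execute_test()
        return self.measure.results


class BusyLoop(TestCase):
    """Hypothetical test case with the same shape as SkipPrologue."""

    def __init__(self, count):
        super().__init__("busy-loop")
        self.count = count
        self.calls = 0

    def _test(self):
        self.calls += 1
        sum(range(self.count))

    def warm_up(self):
        self._test()

    def execute_test(self):
        for i in range(1, 4):
            self.measure.measure(self._test, i)


case = BusyLoop(10000)
results = case.run()
print([run_id for run_id, _ in results])
```

The warm-up call runs the workload once without recording it, so only the three measured runs end up in the results.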

Cache code access for skip-prologue
- Before: GDB reads some bytes from the target over RSP, parses the instruction, and checks whether it is part of the prologue. If yes, it repeats; if no, it stops.
- After: the same loop reads the bytes through the dcache, which reads the target over RSP.
- Note that only x86 and x86-64 make use of this.
[Figure: CPU time (s) versus SKIP_PROLOGUE_COUNT, original vs. patched]
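Why routing small code reads through a cache reduces remote traffic can be shown with a toy model. This is an illustration only, not GDB's dcache: CountingTarget, LineCache, scan and the 64-byte line size are hypothetical, and each target read stands for one round trip over RSP.

```python
class CountingTarget:
    """Pretend remote target; every read() stands for one round trip."""

    def __init__(self, memory):
        self.memory = memory
        self.reads = 0

    def read(self, addr, length):
        self.reads += 1
        return self.memory[addr:addr + length]


class LineCache:
    """Read-through cache that fetches whole aligned lines from the target."""

    def __init__(self, target, line_size=64):
        self.target = target
        self.line_size = line_size
        self.lines = {}

    def read(self, addr, length):
        out = bytearray()
        for a in range(addr, addr + length):
            base = a - a % self.line_size
            if base not in self.lines:
                # Cache miss: fetch the whole line in one target read.
                self.lines[base] = self.target.read(base, self.line_size)
            out.append(self.lines[base][a - base])
        return bytes(out)


def scan(reader, start, end, step=4):
    """Read [start, end) in small chunks, as an instruction scanner might."""
    return b"".join(reader.read(addr, step) for addr in range(start, end, step))


memory = bytes(range(256))
direct = CountingTarget(memory)
behind_cache = CountingTarget(memory)

direct_bytes = scan(direct, 0, 128)
cached_bytes = scan(LineCache(behind_cache), 0, 128)
print(direct.reads, behind_cache.reads)  # 32 2
```

Both scans return the same bytes; the cached one needs two line fetches instead of 32 small reads.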

Cache code access for disassemble
[Figure: number of m packets, original vs. patched, when disassembling evaluate_subexp_standard, handle_inferior_event and c_parse_internal]
[Figure: CPU time (s), original vs. patched, for the same three functions]

My workflow of improving perf
- Examine the RSP log and locate some unnecessary packets:

    (gdb) si
    Sending packet: $qtstatus#49...
    Sending packet: $Z,8483c7,1#b4...Packet received: OK
    Sending packet: $Z,4ceb6b,1#6e...Packet received: OK
    Sending packet: $QPassSignals:...
    Sending packet: $vcont;s:p1b1.1b1;c#2...
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1
    Sending packet: $mbfffef4,4#c...
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1
    Sending packet: $z,8483c7,1#d4...Packet received: OK
    Sending packet: $z,4ceb6b,1#8e...Packet received: OK
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1
    Sending packet: $qxfer:traceframe-info:read::,fff#b...Packet received: E1

- Investigate and hack the code.
- Show the perf improved by the patch.
[Figure: CPU time (s) versus number of single steps, original vs. patched]
- Typical profiling doesn't help:

    - 2.91%  gdbserver  libc-2.14.9.so  [.] strcmp_sse4_2
         strcmp_sse4_2
    - 1.92%  gdbserver  libc-2.14.9.so  [.] vfprintf
       - vfprintf
          - 74.4% _IO_vsprintf
               2.6% handle_tracepoint_query
             + 47.4% x1
          + 24.26% _IO_vsnprintf
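Spotting the repeated packets in a long RSP log is easier with a tally per packet type. The sketch below is one possible helper, not part of GDB: packet_name and count_packets are hypothetical names, the sample lines are copied from the log above as they appear, and the naming rule (packets starting with q, Q or v are named up to the first delimiter, everything else by its first letter) is a simplification.

```python
import re
from collections import Counter

# Payload between "$" and the "#" that starts the checksum.
PACKET_RE = re.compile(r"Sending packet: \$([^#\n]+)#")


def packet_name(payload):
    """Classify a packet payload by a simplified naming rule."""
    if payload[0] in "qQv":
        return re.split(r"[,;:#]", payload, maxsplit=1)[0]
    return payload[0]


def count_packets(log_text):
    """Count sent packets per packet name in a GDB remote debug log."""
    return Counter(packet_name(m.group(1)) for m in PACKET_RE.finditer(log_text))


log = """\
Sending packet: $Z,8483c7,1#b4...Packet received: OK
Sending packet: $vcont;s:p1b1.1b1;c#2...
Sending packet: $qxfer:traceframe-info:read::,fff#b...packet received: E1
Sending packet: $mbfffef4,4#c...
Sending packet: $qxfer:traceframe-info:read::,fff#b...packet received: E1
Sending packet: $z,8483c7,1#d4...packet received: OK
"""

counts = count_packets(log)
print(counts.most_common(1))
```

Sorting by count puts the packet that dominates a single step at the top, which is the one worth investigating first.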

Read multiple cache lines in dcache_xfer_memory
[Figure: number of m packets versus cache line size, original vs. patched]
[Figure: CPU time (s) versus cache line size, original vs. patched]
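A rough model of why fetching several cache lines per request reduces the packet count, and why the gain depends on the line size. This is back-of-the-envelope arithmetic, not GDB's code: line_reads, coalesced_reads and the max_request cap are hypothetical.

```python
def line_reads(addr, length, line_size):
    """Target reads needed when every cache line is fetched separately."""
    first = addr // line_size
    last = (addr + length - 1) // line_size
    return last - first + 1


def coalesced_reads(addr, length, line_size, max_request):
    """Target reads needed when adjacent lines are fetched together.

    max_request is an assumed upper bound on the bytes per request.
    """
    start = addr - addr % line_size
    end = ((addr + length + line_size - 1) // line_size) * line_size
    span = end - start
    return -(-span // max_request)  # Ceiling division.


per_line = line_reads(0x1004, 512, 64)
together = coalesced_reads(0x1004, 512, 64, 4096)
print(per_line, together)  # 9 1
```

With small lines the per-line count grows quickly (33 reads for 16-byte lines in this example), while the coalesced count stays at one as long as the range fits in a single request.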

Track and compare GDB performance
- We still have no way to track and compare perf data.
- One option is a web application to monitor and analyze performance. It is written in Python and easy to set up, but it doesn't meet our needs perfectly and is not actively maintained.

Observations and future work

Observations
- The "thread" command doesn't scale.
[Figure: switching threads with the command "thread N": CPU time (s) versus number of threads]
- dcache is conservative for multi-threaded code (slide annotation on the flush below: "Remove?"):

    enum target_xfer_status
    dcache_read_memory_partial (struct target_ops *ops, DCACHE *dcache,
                                CORE_ADDR memaddr, gdb_byte *myaddr,
                                ULONGEST len, ULONGEST *xfered_len)
    {
      /* If this is a different inferior from what we've recorded,
         flush the cache.  */

      if (! ptid_equal (inferior_ptid, dcache->ptid))
        {
          dcache_invalidate (dcache);
          dcache->ptid = inferior_ptid;
        }

[Figure: disassemble across threads: CPU time (s) versus number of threads, current (remote) vs. patched (remote)]

    static int
    do_captured_thread_select (struct ui_out *uiout, void *tidstr)
    {
      ...
      /* Since the current thread may have changed, see if there is any
         exited thread we can now delete.  */
      prune_threads ();

      return GDB_RC_OK;
    }

Future work
- Compare perf data and raise an alarm if a regression is found.
- More perf test cases.
