Performance Analysis and Visualization of SystemC Models
Adam Donlin and Thomas Lenart
Xilinx Research
Overview
- Performance Analysis != Functional Verification
- Analysis and Visualization Overview
- Simulation Architecture
  - Tcl/Tk Integration
  - Data Member Accessors
  - Performance Monitors
- Visualization Gallery
- Simulation Efficiency
  - Comparing Performance Monitors with SCV data logging
What do we want to analyze?
SystemC Performance Analysis Phases
- Model Instrumentation: annotate the timed functional model with performance monitors
- Scenario Definition: calibrate and configure the models; select benchmark applications
- Data Collection: during simulation, collect information on the specific events that have occurred
- Data Visualization: show the performance trends of the system to the user with intuitive visualizations of the data
[Diagram: the four phases (Model Instrumentation, Scenario Definition, Data Collection, Data Visualization) shown as a flow spanning model design time and simulation time]
Simulator Architecture for Performance Analysis
- Separate GUI/control from the actual simulation
- Interactive: keep the designer in the loop
- Multi-threaded architecture
- Tcl/Tk interfaces directly to models and monitors
  - Represent each SC_MODULE with a distinct Tcl command
  - Hierarchical command names allow access to sub-modules
  - A member function inside the model implements the Tcl command
[Diagram: simulation models (OSCI SystemC: uP, Mem, Peri) connected to a Tk/Tcl simulation control and model visualization environment]
Example session:
  cmd> top.up get PC SP REG1
  cmd> top.up get PC
  0xffffffff
Dispatching Tcl Commands to SystemC Modules
- Each module constructor registers a Tcl command:
    Tcl_CreateCommand(interp, cmdname, funcptr, clientdata, deleteproc);
- Problem: Tcl does not support C++ member function pointers for commands
- Solution: indirect through a static function (sketched below)
  - Pass a reference to the module object through the clientdata parameter so the static function can call its virtual Dispatch()
  - A common Tcl base class (XC_Tcl_Base) is required for each SC_MODULE
  - Class inheritance becomes significant
[Diagram: Tcl_Main() routes commands through XC_Cmd_Dispatch to the simulation models (OPB_Slave, uP, Mem, Peri); XC_Tcl_Base declares "virtual int Dispatch(args) = 0", which each SC_MODULE overrides; framework commands include XC_Run and XC_Load_System]
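The indirection can be sketched as follows. XC_Tcl_Base, the pure virtual Dispatch(), and XC_Cmd_Dispatch come from the slide's diagram; the argument types, the g_interp global, and the uP registration code are illustrative assumptions, not the actual Xilinx implementation (and argv constness should be adjusted to match Tcl_CmdProc in your Tcl version).

  #include <tcl.h>
  #include <systemc.h>

  extern Tcl_Interp* g_interp;  // assumed: created by the framework's Tcl startup code

  // Common base class: every Tcl-visible SC_MODULE derives from it and
  // implements the pure virtual Dispatch() hook.
  class XC_Tcl_Base {
  public:
      virtual ~XC_Tcl_Base() {}
      virtual int Dispatch(Tcl_Interp* interp, int argc, const char* argv[]) = 0;
  };

  // Static trampoline registered with Tcl: Tcl hands back the clientData pointer
  // stored at registration time, which is cast to the common base class and
  // forwarded to the module's virtual Dispatch().
  static int XC_Cmd_Dispatch(ClientData cd, Tcl_Interp* interp,
                             int argc, const char* argv[]) {
      return static_cast<XC_Tcl_Base*>(cd)->Dispatch(interp, argc, argv);
  }

  // Example module: the constructor registers the module's hierarchical SystemC
  // name (e.g. "top.up") as a Tcl command bound to this instance.
  class uP : public sc_core::sc_module, public XC_Tcl_Base {
  public:
      SC_CTOR(uP) {
          // Cast via XC_Tcl_Base* first so that the trampoline's cast back from
          // ClientData recovers the correct subobject under multiple inheritance.
          Tcl_CreateCommand(g_interp, name(), XC_Cmd_Dispatch,
                            static_cast<ClientData>(static_cast<XC_Tcl_Base*>(this)),
                            NULL);
      }

      virtual int Dispatch(Tcl_Interp* interp, int argc, const char* argv[]) {
          // Parse "get"/"set" sub-commands here, write results with Tcl_SetResult(),
          // and return TCL_OK or TCL_ERROR.
          return TCL_OK;
      }
  };

Because every Tcl-visible module shares the XC_Tcl_Base interface, a single trampoline serves all module commands, which is why class inheritance becomes significant in this scheme.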
Data Member Accessors
- Provide direct Tcl access to model variables
- Scenario setup and control:
    <model> get <var>
    <model> set <var> <value>
- Based on the STL map associative container (a sketch of the underlying table follows below)
- Automatically extract and visualize the parameter sets for a model in Tk

In the code:
  SC_MODULE(ublaze) {
    xc_access_flags vars;

    SC_CTOR(ublaze) {
      XC_TCL_ACCESSOR(vars, PC);
      XC_TCL_SYM_ACCESSOR(vars, _MSR, "Machine Status Register");
    }

    virtual int Dispatch(args) {
      // Implement accessor get and set
      xc_tcl_accessor_parser(args, vars);  // Local command parsing
    }
  };

From the command line:
  cmd> top.ublaze get PC
  0xffffffff
  cmd> top.ublaze set PC 0x30000000
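The slide shows the accessor macros but not the container behind them. Below is a minimal sketch of how an STL-map-based accessor table could support <model> get/set; the class and method names (xc_access_table, register_var) and the hex string conversions are illustrative assumptions rather than the actual xc_access_flags API.

  #include <map>
  #include <string>
  #include <sstream>

  class xc_access_table {
      struct entry {
          unsigned long* ptr;      // address of the model's data member
          std::string    symbol;   // optional human-readable description
      };
      std::map<std::string, entry> vars_;

  public:
      void register_var(const std::string& name, unsigned long& var,
                        const std::string& symbol = "") {
          entry e = { &var, symbol };
          vars_[name] = e;
      }

      // "<model> get <var>": returns the value as a hex string, "" if unknown.
      std::string get(const std::string& name) const {
          std::map<std::string, entry>::const_iterator it = vars_.find(name);
          if (it == vars_.end()) return "";
          std::ostringstream os;
          os << "0x" << std::hex << *(it->second.ptr);
          return os.str();
      }

      // "<model> set <var> <value>": accepts "0x..." hex input.
      bool set(const std::string& name, const std::string& value) {
          std::map<std::string, entry>::iterator it = vars_.find(name);
          if (it == vars_.end()) return false;
          std::istringstream is(value);
          is >> std::hex >> *(it->second.ptr);
          return !is.fail();
      }
  };

Under this assumption, a macro such as XC_TCL_ACCESSOR(vars, PC) could simply expand to vars.register_var("PC", PC), so exposing a new variable to Tcl stays a one-line change in the model.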
Performance Monitors
- Sample and store performance data
  - Configurable width, depth, sampling period, output format
  - Leverage STL containers for sample management
  - Macros in the functional code abstract away the details of the monitor's implementation
- Implemented as SystemC modules (a standalone sketch follows below)
  - Instantiate and name like any other module
  - Integrated with the SystemC timing model (periodic sampling)
- Integrated with Tcl
  - Each monitor is also a Tcl command
  - Parameters exported as data accessors

In the model:
  SC_MODULE(mem) {
    xc_monitor<unsigned long> read_mon;

    SC_CTOR(mem) : read_mon("read_mon") { }

    virtual int mem_read(...) {
      // Notify the monitor of the address being read
      XC_NOTIFY(read_mon, read_addr);
    }
  };

In the GUI:
  cmd> top.mem.read_mon set width 0x200
  cmd> top.mem.read_mon format xy

[Diagram: notifications from the timed functional model trigger a capture into the monitor's current samples; current samples roll over into historical samples, which are gathered by Tk/Tcl for visualization]
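To make the bullets concrete, here is a rough sketch of how a periodically sampling monitor can be written as an ordinary SystemC module. Only the xc_monitor name and the notification idea come from the slide; the constructor parameters, buffer layout, and sampling thread are assumptions.

  #include <systemc.h>
  #include <vector>
  #include <deque>

  template <typename T>
  class xc_monitor : public sc_core::sc_module {
  public:
      SC_HAS_PROCESS(xc_monitor);

      xc_monitor(sc_core::sc_module_name nm,
                 sc_core::sc_time period = sc_core::sc_time(1, sc_core::SC_US),
                 std::size_t depth = 512)
          : sc_core::sc_module(nm), period_(period), depth_(depth) {
          SC_THREAD(sample_proc);  // periodic capture, tied to SystemC simulation time
      }

      // Called from the functional model, e.g. via XC_NOTIFY(read_mon, read_addr).
      void notify(const T& value) { current_.push_back(value); }

  private:
      void sample_proc() {
          for (;;) {
              wait(period_);                   // sampling period
              history_.push_back(current_);    // roll current samples into history
              current_.clear();
              if (history_.size() > depth_)    // bounded history depth
                  history_.pop_front();
          }
      }

      sc_core::sc_time period_;
      std::size_t depth_;
      std::vector<T> current_;                 // samples captured in the current period
      std::deque< std::vector<T> > history_;   // historical samples, one entry per period
  };

With this shape, the slide's member initializer read_mon("read_mon") works unchanged, and XC_NOTIFY(read_mon, read_addr) could expand to little more than read_mon.notify(read_addr).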
Performance Monitors
[Diagram: the system model (uP, Mem, Cache) runs in the simulation thread with a monitor attached to the cache; a trigger captures current samples, which roll over into historical samples; the Tcl/Tk environment thread dispatches commands such as "cmd> top.up.cache -read" to fetch the monitor data]
Ingredients: Gallery of Visualizations
- OSCI SystemC 2.1v1
- Tcl/Tk v8.4: command line and GUI scripting environment
- BLT v2.4: graph and GUI infrastructure widget set (http://blt.sourceforge.net/)
- TkTable v2.9 and vu v2.1.0: tabular/spreadsheet widget; pie chart and utility widgets (http://tktable.sourceforge.net/)
- Framework Tcl scripts:
  - Basic framework (250 lines)
  - Graph utility functions (150 lines)
  - Table utility functions (60 lines)
  - GUI management functions (180 lines)
Bus visualization
- Bus utilization when caches are turned on or off
- Instruction vs. data fetch ratio
- Bus congestion: arbitration count and delay
- Distribution of total bus capacity
- Script size: 130 lines
CPU visualization
- Cache statistics: instruction hits/misses; data hits/misses/invalidates
- CPU information: statistics about program execution; registers
- Stack pointer tracing: trace stack-related problems; show the name and time stamp of function calls
- Script size: 530 lines
[Figure: stack pointer behaviour of a recursive program]
Memory visualization
- Heatmap shows high-level trends in memory access (see the binning sketch below)
  - Memory is divided into smaller blocks
  - Color represents access intensity
- Access history records detailed memory access information
  - Identify specific memory access patterns to tune code and optimize, for example, cache behavior
- Script size: 240 lines (heatmap), 80 lines (access history)
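To illustrate the block/intensity idea, the sketch below bins accessed addresses into fixed-size blocks and keeps a counter per block; the block size, class name, and the way counts reach the Tcl heatmap script are assumptions.

  #include <vector>
  #include <cstddef>

  class heatmap_bins {
      std::size_t block_size_;               // bytes per heatmap cell, e.g. 4 KB
      std::vector<unsigned long> counts_;    // one access counter per block

  public:
      heatmap_bins(std::size_t mem_bytes, std::size_t block_size = 4096)
          : block_size_(block_size), counts_(mem_bytes / block_size + 1, 0) {}

      // Called on every memory read/write with the accessed address.
      void record(unsigned long addr) {
          std::size_t block = addr / block_size_;
          if (block < counts_.size())
              ++counts_[block];               // access intensity is the count per block
      }

      // Exported toward Tcl (e.g. through a data accessor) so the GUI can map
      // each count to a color in the heatmap.
      const std::vector<unsigned long>& counts() const { return counts_; }
  };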
Simulation Efficiency: Monitors and SCV
- Enable SCV logging at run-time
- SCV trace: 2 streams, 86 MB, 1,250,000 events
- Monitors: 20 monitors, >10,000,000 events
- 25 ms simulation of a MicroBlaze system running uClinux
[Chart: simulation efficiency compared across four configurations: no instrumentation, 20 Xilinx monitors, 1 SCV stream, and 2 SCV streams]
(A comparison sketch using the SCV recording API follows below.)
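For comparison with the monitor notifications, recording the same memory-read event through SCV looks roughly like the sketch below; the stream, generator, and attribute names and the trace file are placeholders, and the measurements quoted above were taken with the actual Xilinx models, not this sketch.

  #include <systemc.h>
  #include <scv.h>

  // One transaction is begun and ended per memory read; this per-event recording
  // cost is what the slide compares against the lightweight monitor notification.
  struct mem_scv : public sc_core::sc_module {
      scv_tr_stream read_stream;
      scv_tr_generator<unsigned int, unsigned int> read_gen;

      SC_CTOR(mem_scv)
          : read_stream("mem_read_stream", "transactor"),
            read_gen("read", read_stream, "addr", "data") {
          SC_THREAD(test_proc);
      }

      unsigned int mem_read(unsigned int addr) {
          scv_tr_handle h = read_gen.begin_transaction(addr);
          unsigned int data = 0;  // ... perform the actual read here ...
          read_gen.end_transaction(h, data);
          return data;
      }

      void test_proc() {
          wait(10, sc_core::SC_NS);
          mem_read(0x30000000u);
      }
  };

  int sc_main(int argc, char* argv[]) {
      scv_startup();                      // initialize SCV
      scv_tr_text_init();                 // use the plain-text transaction database
      scv_tr_db db("mem_trace.txt");      // placeholder trace file name
      scv_tr_db::set_default_db(&db);

      mem_scv mem("mem");
      sc_core::sc_start(100, sc_core::SC_NS);
      return 0;
  }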
Summary
- Powerful monitoring and visualization based on open-source toolkits
- Highly flexible monitor and accessor modules
- Tcl/Tk lets you roll your own, very quickly!
- Programming model for performance monitoring: further investigation required
- Embedded monitor modules are cost-effective compared to SCV transaction streaming
Thank you!