JVM Tool Interface Michal Pokorný
JVM TI
- Inspect and control execution on the JVM (profiling, debugging, monitoring, thread analysis, coverage, ...)
- Higher-level interface: Java Platform Debugger Architecture
- JVM TI clients are called agents
  - Receive events; control the JVM using functions (in event callbacks or otherwise)
  - Run in the same process and communicate directly with the JVM
  - Often controlled by a separate process
- Dynamically loaded library (.so/.dll) or statically linked into the JVM
- Started on JVM startup (-agentlib / -agentpath / JAVA_TOOL_OPTIONS), or attached at runtime
Architecture
- Agents run within the JVM process; usually written in C or C++
- Phases: ONLOAD, PRIMORDIAL, START, LIVE, DEAD
- Agent starts in a very early JVM phase, or is attached when the JVM is live
- If started early: Agent_OnLoad / Agent_OnUnload (optional); the JVM is terminated if agent setup fails
  - No bytecodes executed, no objects created, no classes loaded yet
- If attached to a live JVM: Agent_OnAttach; the JVM continues running if agent setup fails
- Shared objects (.so/.dll/...) or statically linked into the JVM
  - To avoid name clashes if statically linked: Agent_OnLoad_<agent-library-name>, ...
- Data types: jobject, jclass, jmethodID, jfieldID, ...
Environments and capabilities
- Each agent has a separate environment (jvmtiEnv*)
  - Obtained/disposed by: jint GetEnv(JavaVM *vm, void **env, jint version); and DisposeEnvironment [similar to JNI]
  - Isolates agents (event callbacks, object tags, etc.); JVM state is shared, though
  - Environment Local Storage (opaque void* pointer)
  - Environment is shared between Java threads
- Each environment has a set of capabilities
  - Starts with no capabilities; capabilities must be explicitly added/relinquished
  - Adding capabilities may incur costs
  - Using capabilities may incur further costs
  - Depending on the JVM, the set of potential capabilities may change over time
  - Examples: capabilities limited to 1 agent at any time, to certain phases of execution, ...
  - Imagine a JVM implementation that normally compiles into native code and can switch to bytecode interpretation, but not at runtime. Single-stepping capability may only be
Events
- Callback table, installed in Agent_OnLoad / Agent_OnAttach
- Events are not queued => event handlers must be written carefully
  - Events may trigger other events
  - Exceptions: when calling JNI methods that may throw, we must preserve any currently pending exception
  - Event callbacks must be reentrant (or protected by monitors)
  - Extra rules for ordering co-located events
- VM Initialization, Start, Death
  - VMInit callback called => VM is initialized, agent can complete its initialization
Bytecode instrumentation
- Object allocation, method enter/exit, ... would be costly to signal by events
- Preferred method: bytecode instrumentation (e.g. wrap every allocation)
  - Benefits: it's JITed and optimized for speed, can be conditional
- Static instrumentation: change .class files on disk before loading them into the JVM; "extremely awkward" [sic]
- Load-time: raw .class transformed by the agent in the ClassFileLoadHook event
- Dynamic instrumentation: agent can request resending of ClassFileLoadHook events (RetransformClasses)
  - Fix-and-continue debugging (RedefineClasses)
- Exception: some allocations are not detectable by instrumentation
  - E.g.: reflection, methods without bytecode. Those allocations fire VM Object Allocation events.
Observing execution 1/2
- Breakpoints: <=1 breakpoint per instruction; generates Breakpoint event
- Watched fields: <=1 access watch, <=1 modification watch per field
  - Field specified by jclass, jfieldID; the watch applies to all instances
  - Fires FieldAccess, FieldModification event when about to access, modify
- Notify Frame Pop
  - Ask the JVM to tell the agent when thread X's frame Y is popped (NotifyFramePop)
  - Generates FramePop event
  - Step Over function
- Single Step
  - Fires SingleStep event whenever a new location is reached (except in native code). Very expensive.
- Exception, ExceptionCatch
Observing execution 2/2
- Monitors
  - Monitor Contended Enter (attempting to enter a locked monitor), Entered (when entered)
  - Monitor Wait (about to wait), Waited (finished waiting)
- GC start, finish
  - Only fired for stop-the-world collections
  - Some GC is online, and some JVMs might not even have stop-the-world GC
- Thread Start, End
Heap
- Get/Set Tag (jlong tag of object, local to env), Get Objects With Tags
- Object Free event: only sent for tagged objects
- Iterating through the heap stalls all heap operations
- Filters: only (un)tagged objects/classes; only a particular jclass; callback that returns a visit / don't-visit flag
- Follow References
  - Reports every reference exactly once; BFS
  - Starts: A) from all heap roots, B) from initial_object
- Iterate Through Heap
  - Both reachable and unreachable objects
- Early abort
- Force Garbage Collection
Adding instrumentation bytecode
- Java is easier than C. Let's write instrumentation in Java.
- Add To Bootstrap Class Loader Search
- Add To System Class Loader Search
  - In the live phase the system class loader supports adding a JAR file to be searched if the system class loader implements a method named appendToClassPathForInstrumentation which takes a single parameter of type java.lang.String. The method is not required to have public access.
Instrumenting native methods
- Need to instrument:
    native boolean foo(int x);
- Rewrite:
    boolean foo(int x) { /* instrument */ return wrapped_foo(x); }
    native boolean wrapped_foo(int x);
- Native method lookup for "foo":
  1) Look for native method foo
  2) If it fails and foo starts with $NativeMethodPrefix, try stripping the prefix off and looking again.
- lookup(wrapped_foo) -> fail -> lookup(foo) -> OK
- SetNativeMethodPrefix("wrapped_")
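The two-step lookup on this slide can be sketched as a small toy model in plain C. It does not call JVM TI and it ignores JNI symbol mangling. The names exported_symbols, lookup and resolve_native are hypothetical, introduced only for illustration, and the symbol table stands in for whatever the native library really exports.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the symbols a native library exports. */
static const char *const exported_symbols[] = { "foo", "bar" };

/* Toy lookup: returns 1 if the library exports `name`, 0 otherwise. */
static int lookup(const char *name)
{
    size_t n = sizeof exported_symbols / sizeof exported_symbols[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(exported_symbols[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Models the retry rule: try the name as-is; if that fails and the name
 * starts with the prefix, strip the prefix and try once more.
 * Returns the symbol name that resolved, or NULL if neither attempt did. */
const char *resolve_native(const char *method_name, const char *prefix)
{
    assert(method_name != NULL);

    if (lookup(method_name)) {
        return method_name;               /* step 1 succeeded */
    }
    if (prefix != NULL) {
        size_t plen = strlen(prefix);
        if (plen > 0 && strncmp(method_name, prefix, plen) == 0) {
            const char *stripped = method_name + plen;
            if (lookup(stripped)) {
                return stripped;          /* step 2 succeeded */
            }
        }
    }
    return NULL;
}
```

With the prefix "wrapped_", resolving "wrapped_foo" falls through to "foo"; without a prefix the same lookup fails, which is why the renamed native method would otherwise never bind.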
Manipulating execution
- Get/set local variables, fields
- Threads, thread groups (hierarchy)
- Stack frames
  - Force Early Return
  - Location in code identified by jmethodID and jlocation
  - Translation between jlocation and Java line number
  - Local variables get/set; identified by frame depth and slot number
  - Get Local Instance (get the "this" object; also works for native frames)
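The jlocation-to-line translation can be illustrated with a self-contained sketch. It assumes the JVMBCI location format (a jlocation is a bytecode index) and uses a hypothetical line_entry struct that mirrors the start_location / line_number pairs of a line number table; it is a model of the lookup rule, not JVM TI code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of one line number table entry:
 * the source line `line_number` begins at bytecode index `start_location`. */
typedef struct {
    long long start_location;
    int line_number;
} line_entry;

/* Returns the source line covering `location`: the entry with the greatest
 * start_location that is <= location. Entries are not assumed to be sorted.
 * Returns -1 if no entry covers the location (e.g. -1 for a native frame). */
int line_for_location(const line_entry *table, size_t count, long long location)
{
    assert(table != NULL || count == 0);

    int best_line = -1;
    long long best_start = -1;
    for (size_t i = 0; i < count; i++) {
        if (table[i].start_location <= location &&
            table[i].start_location > best_start) {
            best_start = table[i].start_location;
            best_line = table[i].line_number;
        }
    }
    return best_line;
}
```

In a real agent the table would come from GetLineNumberTable and the location from a stack frame; the linear scan is chosen here because line number tables are small and need not be ordered.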
More tricks
- Estimate size of object (platform-specific)
- Get signatures, modifiers of methods, classes, fields
- Methods: local variable table, line number table
- Redefine Classes
  - For fix-and-continue debugging
  - Method has the same bytecode and equal referenced constants in the constant pool => equal
  - Not equal => the old method, possibly still running on some thread, is obsolete
- JNI method interception (table)
- Extension mechanism: the JVM may provide additional JVM TI functions, events
The end
BACKUP SLIDES
- Objects are identified as JNI refs (jobject, jclass) and derivatives (jthread, jthreadGroup, ...)
- References passed are global/local, but always strong
- All returned references are local
  - Need to manage this resource; guaranteed to be able to create 16 local refs
  - Automatically deallocated before returning from native code
  - Need to convert to a global reference before returning if the reference must stay valid
  - JNI: PushLocalFrame, PopLocalFrame
- Does not apply to jmethodID, jfieldID (they aren't jobjects)
More events
- Method Entry, Exit (expensive; instrumentation recommended instead)
- Native Method Bind
- Class Load, Prepare
- Compiled Method Load, Unload (a single method may have multiple compiled forms)
- Data Dump Request (user-requested by keyboard shortcut)
- Dynamic Code Generated (compiled native code, e.g. interpreter depending on command-line)
- Resource Exhausted (e.g. threads, heap, OOM)
Method
- Line number table, local variable table
- Get Method Name (and Signature): get method name, signature, generic signature
- Get Declaring Class, Modifiers
- Get Max Locals (includes locals for parameters)
- Get Arguments Size (# of local variable slots; two-word args use 2)
- Get Line Number Table (if included in class file)
- Get Method Location (in bytecode)
- Get Local Variable Table (valid bytecode section, name + signature + generic signature, slot)
- Get Bytecodes
- Is Method Native; Synthetic; Obsolete
- Set Native Method Prefix(es): works for the entire environment
  - This function modifies the failure handling of native method resolution by allowing retry with a prefix applied to the name. When used with the ClassFileLoadHook event, it enables native methods to be instrumented.
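The slot-counting rule behind "two-word args use 2" can be shown with a short sketch that parses a JVM method descriptor. The function count_argument_slots and its has_receiver flag are hypothetical, introduced for illustration; it models how the JVM lays out parameters in local variable slots and is not a reimplementation of Get Arguments Size itself.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the local variable slots taken by the parameters of a method
 * descriptor such as "(IJLjava/lang/String;[[D)V".
 * long (J) and double (D) take two slots; every other type, including any
 * array, takes one. If has_receiver is non-zero, one slot is added for
 * `this` (slot 0 of an instance method).
 * Returns -1 if the descriptor is malformed. */
int count_argument_slots(const char *descriptor, int has_receiver)
{
    assert(descriptor != NULL);

    const char *p = descriptor;
    int slots = has_receiver ? 1 : 0;

    if (*p++ != '(') {
        return -1;
    }
    while (*p != ')') {
        int is_array = 0;
        while (*p == '[') {            /* array dimensions */
            is_array = 1;
            p++;
        }
        switch (*p) {
        case 'J': case 'D':
            slots += is_array ? 1 : 2; /* an array is a one-slot reference */
            p++;
            break;
        case 'B': case 'C': case 'F': case 'I': case 'S': case 'Z':
            slots += 1;
            p++;
            break;
        case 'L':                      /* class type, runs up to ';' */
            while (*p != ';') {
                if (*p == '\0') {
                    return -1;
                }
                p++;
            }
            p++;
            slots += 1;
            break;
        default:                       /* unknown type or premature end */
            return -1;
        }
    }
    return slots;
}
```

For "(IJLjava/lang/String;[[D)V" on a static method this gives 1 + 2 + 1 + 1 slots; passing has_receiver = 1 adds the slot for `this`.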
Class
- Get Loaded Classes, Get Classloader Classes (loaded by)
- Get Signature (java.util.List => "Ljava/util/List;", int[] => "[I", java.lang.Integer.TYPE => "I")
- Get Status, Get Source File Name, Modifiers (public/protected/private), Methods (both constructors/static initializers and true methods; not inherited methods), Fields, Implemented Interfaces, Class Version Numbers
- Get Constant Pool, Get Class Loader
- Is Interface, Is Array Class, Is Modifiable Class
- Get Source Debug Extension (TODO:???)
- Retransform Classes: bytecode instrumentation of already loaded classes
- Redefine Classes
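The signature examples on this slide follow a mechanical rule, sketched below in plain C. The helper type_name_to_signature is hypothetical and only handles primitive names, dotted class names and trailing [] pairs; nested classes and generics are out of scope for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the one-letter JVM code for a primitive type name, or '\0'. */
static char primitive_code(const char *name, size_t len)
{
    static const struct { const char *name; char code; } table[] = {
        {"boolean", 'Z'}, {"byte", 'B'}, {"char", 'C'}, {"short", 'S'},
        {"int", 'I'}, {"long", 'J'}, {"float", 'F'}, {"double", 'D'},
    };
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strlen(table[i].name) == len &&
            strncmp(table[i].name, name, len) == 0) {
            return table[i].code;
        }
    }
    return '\0';
}

/* Converts a Java type name ("int[]", "java.util.List") into a JVM type
 * signature ("[I", "Ljava/util/List;"), written to `out`.
 * Returns 0 on success, -1 on empty input or if `out` is too small. */
int type_name_to_signature(const char *type_name, char *out, size_t out_size)
{
    assert(type_name != NULL && out != NULL);

    size_t len = strlen(type_name);
    size_t dims = 0;

    /* Strip trailing "[]" pairs; each one becomes a leading '['. */
    while (len >= 2 && type_name[len - 2] == '[' && type_name[len - 1] == ']') {
        dims++;
        len -= 2;
    }
    if (len == 0) {
        return -1;
    }

    char code = primitive_code(type_name, len);
    size_t needed = dims + (code ? 1 : len + 2) + 1;  /* +1 for '\0' */
    if (needed > out_size) {
        return -1;
    }

    size_t pos = 0;
    for (size_t i = 0; i < dims; i++) {
        out[pos++] = '[';
    }
    if (code) {
        out[pos++] = code;
    } else {
        out[pos++] = 'L';
        for (size_t i = 0; i < len; i++) {
            out[pos++] = (type_name[i] == '.') ? '/' : type_name[i];
        }
        out[pos++] = ';';
    }
    out[pos] = '\0';
    return 0;
}
```

An agent usually goes the other way, reading signatures returned by Get Signature, but writing the forward mapping makes the encoding of arrays and class names explicit.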
Raw Monitor
- Create, Destroy
- Enter, Exit, Wait, Notify, Notify All
JNI interception (get/set JNI method table)
Events
- Some events can be enabled/disabled per thread (impossible for: VMInit, VMStart, VMDeath, ThreadStart, CompiledMethodLoad, CompiledMethodUnload, DynamicCodeGenerated, DataDumpRequest)
- All events must be explicitly enabled; some need capabilities
- Generate (missed events, when attaching): CompiledMethodLoad, DynamicCodeGenerated
Timers
- Get Current Thread CPU Timer Information
- Get Current Thread CPU Time
- Get Thread CPU Timer Information
- Get Thread CPU Time
- Get Timer Information
- Get Time
- Get Available Processors (interestingly, this may change at runtime)
System Properties
- Get System Properties, Get/Set System Property
- Strongly recommended property keys: java.vm.vendor, java.vm.version, java.vm.name, java.vm.info, java.library.path, java.class.path
- Since this is a VM view of system properties, the set of available properties and their values will usually be different from that in java.lang.System.getProperties.
General
- Get Version Number (of JVM TI / JNI), Get Error Name
- Set Verbose Flag (-verbose:gc, -verbose:class (loading), -verbose:jni, other)
- Get JLocation Format
  - jlocation is intentionally unconstrained.
  - JVMBCI: index into GetBytecodes() / MACHINEPC: native machine program counter values / OTHER
Perturbing dependencies
    public Object() { MyProfiler.allocationTracker(this); }
- The first created object causes a load of MyProfiler, which causes new Object(), which causes a load of MyProfiler (since it's not finished yet): recursion.
- Remedy: make the call conditional, if (trackAllocations).
Stack frames
- Get Stack Trace, All Stack Traces, Thread List Stack Traces, Frame Count
- Pop Frame: return to the state before calling the current function
- Get Frame Location
- jvmtiFrameInfo: stack frame information structure; jmethodID + jlocation (instruction index / -1 for native); GetLineNumberTable
- jvmtiStackInfo: stack information structure; list of jvmtiFrameInfo
Object
- Get size (platform-specific approximation), hash code, monitor usage
Field
- Get declaring class, get modifiers, is synthetic? (no source code correspondence)
Get all stack traces
    #include <jvmti.h>

    jvmtiStackInfo *stack_info;
    jint thread_count;
    int ti;
    jvmtiError err;

    /* One call returns the stacks of all live threads. */
    err = (*jvmti)->GetAllStackTraces(jvmti, MAX_FRAMES, &stack_info, &thread_count);
    if (err != JVMTI_ERROR_NONE) { /* ... */ }
    for (ti = 0; ti < thread_count; ++ti) {
        jvmtiStackInfo *infop = &stack_info[ti];
        jthread thread = infop->thread;
        jint state = infop->state;
        jvmtiFrameInfo *frames = infop->frame_buffer;
        int fi;

        myThreadAndStatePrinter(thread, state);
        for (fi = 0; fi < infop->frame_count; fi++) {
            myFramePrinter(frames[fi].method, frames[fi].location);
        }
    }
    /* this one Deallocate call frees all data allocated by GetAllStackTraces */
    err = (*jvmti)->Deallocate(jvmti, (unsigned char *)stack_info);
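Get stack trace of thread
    jvmtiFrameInfo frames[5];
    jint count;
    jvmtiError err;

    /* Fetch at most 5 frames, starting at the topmost frame (depth 0). */
    err = (*jvmti)->GetStackTrace(jvmti, aThread, 0, 5, frames, &count);
    if (err == JVMTI_ERROR_NONE && count >= 1) {
        char *methodName;
        err = (*jvmti)->GetMethodName(jvmti, frames[0].method, &methodName, NULL, NULL);
        if (err == JVMTI_ERROR_NONE) {
            printf("Executing method: %s", methodName);
            /* The name was allocated by JVM TI and must be released. */
            (*jvmti)->Deallocate(jvmti, (unsigned char *)methodName);
        }
    }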
Get loaded classes
    jvmtiEnv *jvmti;
    (*jvm)->GetEnv(jvm, (void **)&jvmti, JVMTI_VERSION_1_0);
    ...
    jvmtiError err = (*jvmti)->GetLoadedClasses(jvmti, &class_count, &classes);
    /* in C++: jvmti->GetLoadedClasses(&class_count, &classes); */
    // TODO: DisposeEnvironment
    // TODO: jvmtiError == JVMTI_ERROR_NONE
Memory management
- jvmtiError Allocate(jvmtiEnv* env, jlong size, unsigned char** mem_ptr)
  - OK to call from heap iteration callbacks and from GC / ObjectFree event handlers
- jvmtiError Deallocate(jvmtiEnv* env, unsigned char* mem)