Performance measurement of interpreted, JIT, and dynamically compiled executions

My research focuses on problems associated with measuring the performance of interpreted, just-in-time (JIT) compiled, and dynamically compiled executions. Two characteristics of these executions make performance measurement difficult: (1) the virtual machine (VM) program and the application program are interdependent; and (2) the application program code may be translated at run-time by the VM (e.g., dynamically compiled Java code is transformed from byte-code to native code and may be executed in both forms). My solution is a representational model for describing performance data from interpreted, JIT compiled, and dynamically compiled executions that addresses the problems associated with these two characteristics. The model allows for a concrete description of behaviors in interpreted, JIT compiled, and dynamically compiled executions, and it serves as a reference point for what is needed to implement a performance tool for measuring these types of executions. An implementation of the model can answer performance questions about arbitrary interactions between the VM and the application, and it can represent performance data in a language that both an application developer and a VM developer can understand. The model describes performance data in terms of the different forms of an application program, describes run-time transformational costs associated with dynamically compiled code, and correlates performance data collected for different forms of application code.
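As a rough illustration of the three things the model describes (per-form performance data, run-time transformation cost, and correlation across forms), a minimal Java sketch follows. All names here (CodeForm, MethodProfile, PerformanceModel) are hypothetical and chosen only for this example; they are not taken from the model's actual definition or from any tool's implementation.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; an illustrative sketch, not an actual tool's data model.

// The two forms in which a dynamically compiled method may execute.
enum CodeForm { BYTE_CODE, NATIVE_CODE }

// Performance data for one logical application method, across all of its forms.
final class MethodProfile {
    // Time attributed to executing the method in each form, in nanoseconds.
    private final Map<CodeForm, Long> executionNanos = new EnumMap<>(CodeForm.class);
    // Run-time cost of translating the method from byte-code to native code.
    private long transformationNanos;

    void recordExecution(CodeForm form, long nanos) {
        executionNanos.merge(form, nanos, Long::sum);
    }

    void recordTransformation(long nanos) {
        transformationNanos += nanos;
    }

    long executionNanos(CodeForm form) {
        return executionNanos.getOrDefault(form, 0L);
    }

    long transformationNanos() {
        return transformationNanos;
    }

    // Correlates the forms: total cost attributable to this one logical method.
    long totalNanos() {
        long total = transformationNanos;
        for (long nanos : executionNanos.values()) {
            total += nanos;
        }
        return total;
    }
}

// Keyed by a logical method name so that data for both forms lands in one record.
final class PerformanceModel {
    private final Map<String, MethodProfile> profiles = new HashMap<>();

    MethodProfile profileFor(String methodName) {
        return profiles.computeIfAbsent(methodName, k -> new MethodProfile());
    }
}
```

Keying the record by the logical method, rather than by a code address, is the design choice that lets measurements of the byte-code form and the native form be compared and summed for the same method.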

To demonstrate the model, I built Paradyn-J, a prototype performance tool for measuring interpreted and dynamically compiled Java executions. Paradyn-J dynamically inserts instrumentation into the Java VM and into the Java application's byte-code and native code as the application is run by the VM. Paradyn-J requires no modifications to the Java VM, nor does it require changes to Java source or class files prior to execution. Paradyn-J represents data in terms of: specific VM overhead and other costs associated with the interpreted execution of individual application method byte-codes; run-time transformational costs associated with dynamically compiling Java method byte-codes; direct execution costs of the resulting native code versions of run-time compiled byte-codes; and costs of residual dependencies of the native code form of the application on the VM (e.g., object creation overhead).