The High-Performance Java Tutorial at ICS'99 will take place on June 19, 1999 in conjunction with ICS'99. The day is dedicated to two tutorials on Java programming, compilation and system support. The tutorials will discuss several optimization techniques that improve the performance of Java programs.
Target audience: Java VM/JIT developers, application programmers. Assumed background: Java language, compilation techniques
Part 1: Java programming
length : 1 hour
Part 2: Java JIT-compilation
length : 3 hours
ABSTRACT
The Java model achieves security and platform independence by translating and distributing applications in Java bytecodes, the instructions of Java Virtual Machine. In the early releases of Java Development Kit, bytecodes were interpreted by the JVM. As Just-In-Time compilation technology has matured, JIT compilers have now become integral parts of JVM implementations. By compiling bytecodes into the host-machine instruction set and optimizing the code on the fly, the performance of Java applications is significantly improved, even matching the performance of statically compiled languages. Given that compilation takes place during the execution time of an application, JIT-compiler designers are faced with a major challenge of identifying cost effective optimizations and implementing them in an efficient manner.
In this tutorial, a team from Intel's Microcomputer Research Labs that has developed a Java JIT-compiler for the Intel Architecture presents their experiences and results. We start with an overview of the Java language and the components of a Java environment. We give detailed descriptions of Java constructs that have direct impact on compilation. Covered topics include classes and objects, interfaces, threads, exception handling, and garbage collection. Next, we give a comprehensive and in-depth presentation about the structure and optimizations of the state-of-the-art JIT-compiler that we have developed. This includes JVM-JIT and JIT-GC interactions, intermediate languages for translating bytecodes for Intel Architecture, optimization techniques at various levels, exception handling, and garbage collection. We further discuss a spectrum of high-level optimizations such as CSE, loop-invariant code-motion, elimination of range checks and cast checks, inlining, speculative inlining, method specialization, multiversion code, optimizations enabled due to strong typing, and hardware-assisted exception-handling. We also present the techniques and heuristics that we used for low-level optimizations such as local and global register allocation, instruction selection and scheduling, and code and data alignment. Moreover, we share our experiences with advanced vectorization techniques in support of Intel's MMX Technology, as well as a demonstration of VTune, Intel's performance analysis tool that assisted us in our optimizations design.
Target Audience: Java VM and compiler developers.
Assumed Background: Java programming language, basic compiler
technology
Length: 4 hours
Outline of the tutorial:
ABSTRACT:
In this tutorial, we describe the state of the art in optimized compilation of Java programs. In the first half of the tutorial, we discuss issues and techniques related to dynamic optimization of general Java programs. As motivation, we show a breakdown of execution time for some Java benchmark programs, and explain why these Java programs run slower than comparable C/C++ programs. There are several aspects of Java semantics that cause complications for traditional optimizations e.g., exceptions, garbage collection, threads, synchronization, and dynamic class loading. We describe aggressive models of these semantics features that allow more code motion and optimization to occur, compared to simpler models of these features. On the positive side, the semantics of pointers (object references) in Java is more retricted than the semantics of pointers in C/C++ programs. We describe algorithms for pointer-induced alias analysis for Java that are more precise and efficient than corresponding algorithms for C/C++, and we discuss two key optimizations for Java --- synchronization optimization and stack allocation of heap data --- that use the results of pointer-induced alias analysis. Finally, we briefly mention some speculative optimizations that are well-suited to dynamic compilation e.g., optimization of methods with respect to the (dynamic) extant class hierarchy, specialization of methods with respect to data values, and optimistic assumptions about exceptions. We also outline recovery techniques that guarantee correctness when speculative assumptions are violated.
In the second half of the tutorial, we discuss techniques for static compilation and optimization of scientific programs written in Java. The growing popularity of Java has led to a great deal of interest in using it for numerically intensive computing, even though it was not originally intended for applications in that domain. We describe the important factors that inhibit performance of scientific programs written in Java, like (i) precise exceptions and mandatory checks for null pointers and out-of-bounds array access violations, (ii) absence of true multidimensional arrays, (iii) lack of support for complex data type, and (iv) restrictions in floating point semantics. We first discuss the solutions that allow high performance, approaching the levels of Fortran and C/C++, to be delivered without compromising on the purity of Java. This is achieved through a combination of using class libraries and a standard array package, and compiler techniques such as creation of exception-free regions and semantic expansion of known method calls. Each of these solutions is discussed in depth, with supporting data based on performance measurements. We shall also discuss proposals that seek to modify elements of the floating point semantics of Java, in order to achieve higher performance.