Class Hierarchical Analysis (CHA) examples in C1 (Hotspot JVM) – Part #1

Virtual calls often come at a significant performance cost. It’s not only that memory accesses to vtables take CPU cycles and can pollute caches; the savings from method inlining are also lost. Good engineering practice is to use expensive resources only when needed. Even when a programming language offers syntactic hints for the developer to make that decision (e.g. virtual or final method declarations), it’s ultimately up to a good compiler to perform a thorough analysis and optimize.

In this series of short examples, we will see how the C1 just-in-time compiler in the Hotspot Java Virtual Machine (JVM) performs Class Hierarchical Analysis (CHA) and deals with virtual calls. For each case, we will discuss what should happen, observe what actually happened and work out an explanation.

All the examples are based on the following Java class topology: interface I extends interface H, and classes A1 and A2 both implement I.

This is the base template code:

interface H {
    void m();
}

interface I extends H {
}

class A1 implements I {
    public void m() {
        System.out.println("A1::m");
    }
}

class A2 implements I {
    public void m() {
        System.out.println("A2::m");
    }
}

public final class Main {

    // JIT-compiled (C1).
    // Called from Main::main.
    // Will contain a virtual callsite to ::m in each example.
    public static void j<n>(...) throws Throwable {
        ...
    }

    public static void main(String[] args) throws Throwable {
        ...
    }

}

A few notes before moving on:

The template is compiled with

javac -sourcepath . -d . Main.java

The template is run with

java -classpath . -Xcomp -XX:TieredStopAtLevel=1 -XX:CompileOnly=Main::j1,Main::j2,Main::j3,H,I,A1,A2 -XX:CompileCommand=print,Main::j1 -XX:CompileCommand=print,Main::j2 -XX:CompileCommand=print,Main::j3 Main

For each example n, we will fill in the bodies of Main::j<n> (for example, Main::j1) and Main::main.
We refer to virtual calls in a broad sense that includes interface calls in Java.
All this work is based on JDK 17 running on Fedora Linux x86_64.

Example 1

public static void j1() throws Throwable {
    I i = new A1();
    i.m();
}

public static void main(String[] args) throws Throwable {
    // Load A1. A2 is not loaded.
    new A1();
    j1();
}

Independently of class A2, the only possible target for the virtual callsite in j1 is A1::m. Even if class A1 is subclassed at any point of the execution with an additional implementation of m (A1 is not final and classes can be dynamically loaded in the JVM), instances of the subclass cannot reach the virtual callsite. Thus, a static binding to the target method should be safe in this scenario.
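The argument can be tried out in plain Java. The following is only an illustrative sketch: the names ExactReceiverDemo and A1Sub are hypothetical, and m returns a String instead of printing (an assumption made so the result is easy to check). Even when a subclass of A1 that overrides m exists and is loaded, the callsite in j1 only ever sees the receiver created by new A1().

```java
// Illustrative sketch; ExactReceiverDemo and A1Sub are hypothetical names.
class ExactReceiverDemo {
    interface H { String m(); }   // returns a String here (assumption) instead of printing
    interface I extends H { }

    static class A1 implements I {
        public String m() { return "A1::m"; }
    }

    // A subclass that could be loaded at any time, since A1 is not final.
    static class A1Sub extends A1 {
        @Override public String m() { return "A1Sub::m"; }
    }

    // Same shape as j1: the receiver is allocated right before the call,
    // so its exact type is A1 regardless of which subclasses exist.
    static String j1() {
        I i = new A1();
        return i.m();
    }

    public static void main(String[] args) {
        System.out.println(j1());        // A1::m
        I other = new A1Sub();           // loads and instantiates the subclass
        System.out.println(other.m());   // A1Sub::m, but through a different callsite
        System.out.println(j1());        // still A1::m
    }
}
```

The subclass changes what other callsites may observe, but not the one inside j1, which is why binding it statically is safe.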

At the bytecode level, this is what j1 looks like:

public static void j1() throws java.lang.Throwable;
    Code:
       0: new           #7 // class A1
       3: dup
       4: invokespecial #9 // Method A1."<init>":()V
       7: astore_0
       8: aload_0
       9: invokeinterface #10, 1 // InterfaceMethod I.m:()V
      14: return

The virtual call is the one at instruction #9: invokeinterface. We know a few things about the callsite’s target method: its name is m, its signature is ()V (so it receives no arguments and returns void) and it’s either declared in or inherited by an interface named I.
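The same three facts (name, descriptor and the interface used for the lookup) can be reproduced from Java with the standard java.lang.invoke and reflection APIs. This is a minimal sketch under our own naming: CallsiteDescriptorDemo and its helper methods are hypothetical, and the types are nested only to keep the example in one file.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Illustrative sketch; CallsiteDescriptorDemo is a hypothetical name.
class CallsiteDescriptorDemo {
    interface H { void m(); }
    interface I extends H { }

    static class A1 implements I {
        public void m() { System.out.println("A1::m"); }
    }

    // The JVM descriptor for "no arguments, returns void".
    static String descriptor() {
        return MethodType.methodType(void.class).toMethodDescriptorString();
    }

    // Look m up through I and report which type actually declares it.
    static String declaringInterface() {
        try {
            return I.class.getMethod("m").getDeclaringClass().getSimpleName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(descriptor());          // ()V
        System.out.println(declaringInterface());  // H

        // Find m by name and descriptor through I, then invoke it on an A1 receiver.
        MethodHandle mh = MethodHandles.lookup()
                .findVirtual(I.class, "m", MethodType.methodType(void.class));
        mh.invoke((I) new A1());                   // prints A1::m
    }
}
```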

To better understand compiler decisions, it’s helpful to scrutinize some of the JVM internal structures that we have. Java classes and interfaces H, I and A1 are represented at the JVM level as instances of the C++ class InstanceKlass. Let’s start with the interface I and explore from there:

(gdb) set $IClass = (InstanceKlass*)0x8000c1210

(gdb) x/s $IClass->_name->_body
"I"

(gdb) print $IClass->_methods->_length
0

(gdb) print $IClass->_local_interfaces->_length
1

(gdb) x/s $IClass->_local_interfaces->_data[0]->_name->_body
"H"

(gdb) print $IClass->_local_interfaces->_data[0]->_methods->_length
1

(gdb) set $MMethod = $IClass->_local_interfaces->_data[0]->_methods->_data[0]

(gdb) x/s (**((Symbol**)((intptr_t*)(((char*)(((Method*)$MMethod)->_constMethod->_constants)) + sizeof(ConstantPool)) + (((Method*)$MMethod)->_constMethod->_name_index))))._body
"m"

What we see is that I does not have any methods of its own. If we look into the interfaces of I, there is H, which contains the method H::m. We have not looked into the flags of H::m, but it’s obviously abstract because H does not provide an implementation. Two simple remarks: 1) I inherits but does not contain m (which is why its _methods array is empty); and 2) an abstract method is still represented with an entry in the _methods array.
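Java reflection gives a rough, language-level view of the same facts without a debugger. The sketch below is an analogy, not a dump of the JVM structures: getDeclaredMethods only loosely corresponds to the _methods array, and InterfaceMethodsDemo is a hypothetical name.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Illustrative sketch; InterfaceMethodsDemo is a hypothetical name.
class InterfaceMethodsDemo {
    interface H { void m(); }
    interface I extends H { }

    // Number of methods declared directly by a type (inherited ones excluded).
    static int declaredCount(Class<?> type) {
        return type.getDeclaredMethods().length;
    }

    public static void main(String[] args) {
        System.out.println(declaredCount(I.class));                 // 0
        System.out.println(I.class.getInterfaces()[0] == H.class);  // true

        Method m = H.class.getDeclaredMethods()[0];
        System.out.println(m.getName());                            // m
        System.out.println(Modifier.isAbstract(m.getModifiers()));  // true

        // I still inherits m, even though it declares nothing itself.
        System.out.println(I.class.getMethods().length);            // 1
    }
}
```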

Let’s look down the hierarchy now:

(gdb) set $IImplementor = (*((InstanceKlass**)((char*)$IClass+sizeof(InstanceKlass)+($IClass->_vtable_len*sizeof(void*))+($IClass->_itable_len*sizeof(void*))+($IClass->_nonstatic_oop_map_size*sizeof(void*)))))

(gdb) x/s $IImplementor->_name->_body
"A1"

(gdb) set $A1Class = $IImplementor

(gdb) print $A1Class->_methods->_length
2

(gdb) set $A1Method0 = $A1Class->_methods->_data[0]

(gdb) x/s (**((Symbol**)((intptr_t*)(((char*)(((Method*)$A1Method0)->_constMethod->_constants)) + sizeof(ConstantPool)) + (((Method*)$A1Method0)->_constMethod->_name_index))))._body  
"<init>"

(gdb) set $A1Method1 = $A1Class->_methods->_data[1]

(gdb) x/s (**((Symbol**)((intptr_t*)(((char*)(((Method*)$A1Method1)->_constMethod->_constants)) + sizeof(ConstantPool)) + (((Method*)$A1Method1)->_constMethod->_name_index))))._body  
"m"

I has only one implementor: A1. This is true because A2 has not been loaded in this example. A1 has two methods: <init> (the constructor) and A1::m. Notice that H::m and A1::m are different methods, the latter being a concrete one.
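The distinction between H::m and A1::m is also visible through reflection. This is a small sketch with hypothetical names (ImplementorMethodsDemo, find, isConcrete); note that reflection reports constructors separately from methods, whereas the _methods array above holds <init> alongside m.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Illustrative sketch; ImplementorMethodsDemo is a hypothetical name.
class ImplementorMethodsDemo {
    interface H { void m(); }
    interface I extends H { }

    static class A1 implements I {
        public void m() { System.out.println("A1::m"); }
    }

    // Finds the public method m as seen from the given type.
    static Method find(Class<?> type) {
        try {
            return type.getMethod("m");
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean isConcrete(Method method) {
        return !Modifier.isAbstract(method.getModifiers());
    }

    public static void main(String[] args) {
        Method inH = find(H.class);
        Method inA1 = find(A1.class);

        System.out.println(A1.class.getDeclaredConstructors().length); // 1 (the <init> method)
        System.out.println(inA1.getDeclaringClass() == A1.class);      // true
        System.out.println(isConcrete(inH));                           // false
        System.out.println(isConcrete(inA1));                          // true
        System.out.println(inH.equals(inA1));                          // false: two distinct methods
    }
}
```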

What A1 has in its vtable, after the slots inherited from its superclass (java.lang.Object), is a pointer to A1::m:

(gdb) print (*(Method**)(((char*)$A1Class)+sizeof(InstanceKlass)+sizeof(void*)*$A1Class->_super->_vtable_len)) == $A1Method1
true

More interestingly, we can now look at A1 interface vtables:

(gdb) print $A1Class->_local_interfaces->_length
1

(gdb) print $A1Class->_local_interfaces->_data[0] == $IClass
true

(gdb) print $A1Class->_itable_len
7

(gdb) x/s (*(InstanceKlass**)(((char*)$A1Class)+sizeof(InstanceKlass)+$A1Class->_vtable_len*sizeof(void*)+sizeof(void*)*0))->_name->_body
"H"

(gdb) set $HVtableOffset = (*(int*)(((char*)$A1Class)+sizeof(InstanceKlass)+$A1Class->_vtable_len*sizeof(void*)+sizeof(void*)*1))

(gdb) x/s (*(InstanceKlass**)(((char*)$A1Class)+sizeof(InstanceKlass)+$A1Class->_vtable_len*sizeof(void*)+sizeof(void*)*2))->_name->_body
"I"

(gdb) set $IVtableOffset = (*(int*)(((char*)$A1Class)+sizeof(InstanceKlass)+$A1Class->_vtable_len*sizeof(void*)+sizeof(void*)*3))

A1 has two interface vtables: H and I. In the gdb variables $HVtableOffset and $IVtableOffset we have the offset to each of them, counting from the start of InstanceKlass. We expect the I interface vtable to be empty because I declares no methods. However, we can look at the first -and only- entry of the interface vtable that corresponds to H:

(gdb) print (*(Method**)(((char*)$A1Class)+$HVtableOffset)) == $A1Method1
true

In the interface vtable that corresponds to H, A1 has a pointer to A1::m.

In summary, when we pass an instance of A1 (in object-oriented terms, a receiver) to a callsite whose target is either H::m or I::m, the method A1::m will be obtained from A1’s interface vtables and invoked. If the callsite has an A1::m target, the A1 vtable is used to obtain the A1::m reference.
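From the Java side, the three kinds of callsite look like this. The sketch uses hypothetical names (DispatchDemo, viaH, viaI, viaA1) and has m return a String instead of printing, an assumption made so the outcome can be checked. javac emits invokeinterface for the first two calls and invokevirtual for the third, yet all of them end up in A1::m for an A1 receiver.

```java
// Illustrative sketch; DispatchDemo and its helper methods are hypothetical names.
class DispatchDemo {
    interface H { String m(); }   // returns a String here (assumption) instead of printing
    interface I extends H { }

    static class A1 implements I {
        public String m() { return "A1::m"; }
    }

    static String viaH(H h)   { return h.m(); }  // invokeinterface with target H.m
    static String viaI(I i)   { return i.m(); }  // invokeinterface with target I.m
    static String viaA1(A1 a) { return a.m(); }  // invokevirtual with target A1.m

    public static void main(String[] args) {
        A1 receiver = new A1();
        System.out.println(viaH(receiver));   // A1::m
        System.out.println(viaI(receiver));   // A1::m
        System.out.println(viaA1(receiver));  // A1::m
    }
}
```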

Now let’s look back at the invokeinterface callsite with target I::m in Main::j1. If we could guarantee that the only possible receiver there is of type A1, then we know that the method to be invoked will be A1::m (obtained from A1 interface vtables).

At the C1 level, this is what the virtual callsite in j1 looks like:

[Verified Entry Point]
# {method} {0x7fb9f88002b8} 'j1' '()V' in 'Main'
0x7fba05622c80: mov    %eax,-0x16000(%rsp)
...
0x7fba05622c8c: movabs $0x801000400,%rdx ; {metadata('A1')}
...
0x7fba05622cbc: mov    %rdx,%rcx
...
0x7fba05622cbf: movabs $0x800000000,%r10
0x7fba05622cc9: sub    %r10,%rcx
0x7fba05622ccc: mov    %ecx,0x8(%rax)
...
0x7fba05622cd8: jmpq   0x7fba05622db5
...
 ;; patch entry point
0x7fba05622db5: callq  0x7fba05628a00 ; ImmutableOopMap {}
                                      ; *getstatic out
                                      ; {runtime_call
                                      ; load_mirror_patching}
...

In the first instructions we see how an instance of A1 is created: %rax points to the new instance and, at offset 0x8 (object header), a compressed pointer to InstanceKlass A1 is written (0x801000400 - 0x800000000 = 0x1000400). The last instruction listed is a trampoline that calls a patcher at run time. After a couple of patcher calls, the code to obtain System.out will be written starting at 0x7fba05622cd8. The important thing for us is that this code is part of A1::m. As predicted, the virtual callsite was statically bound to A1::m and then inlined.
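The compressed pointer arithmetic can be replayed in a few lines of Java. This is just a sketch of the subtraction visible in the listing: NarrowKlassDemo and encode are hypothetical names, and the shift of 0 is an assumption inferred from the fact that the generated code only subtracts the base before storing the 32-bit value.

```java
// Illustrative sketch; NarrowKlassDemo and encode are hypothetical names.
class NarrowKlassDemo {
    // Values read from the listing above.
    static final long KLASS_ADDRESS = 0x801000400L;  // metadata('A1')
    static final long ENCODING_BASE = 0x800000000L;  // constant loaded into %r10

    // Assumption: a shift of 0, since the listing subtracts the base and nothing else.
    static long encode(long klass, long base, int shift) {
        return (klass - base) >>> shift;
    }

    public static void main(String[] args) {
        long narrow = encode(KLASS_ADDRESS, ENCODING_BASE, 0);
        System.out.println("0x" + Long.toHexString(narrow));  // 0x1000400
    }
}
```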

C1 decided in GraphBuilder::invoke that the callsite was monomorphic and could thus be statically bound to a method. These are some initial context values when it is called:

Bytecodes::Code code: _invokeinterface
ciMethod* target: H::m
ciKlass* holder: I
ciInstanceKlass* klass: H
ciInstanceKlass* calling_klass: Main
ciInstanceKlass* callee_holder: I
ciInstanceKlass* actual_recv: I

The first action is to try to determine the exact type of the receiver. The JVM keeps track of the instructions that pushed values to the stack, so it’s possible to trace how the receiver got there. This brings us to the new A1() instruction, represented by an instance of the C++ class NewInstance (a subclass of Instruction):

(gdb) x/s ((NewInstance*)0x7fffa4033510)->_klass->_name->_symbol->_body
"A1"

The class NewInstance overrides the exact_type method to return the class. The variable type points to A1 and it’s an exact type. receiver_klass is set to A1 and the exact target method is located by calling ciMethod::resolve_invoke with the following parameters: target = H::m, calling_klass = Main and receiver_klass = A1. I won’t dive into the details of ciMethod::resolve_invoke, but the aforementioned JVM structures should give an idea of why A1::m is returned (a method with name m and signature ()V is found in the _methods array of the A1 receiver_klass). Once we know that static binding is possible, code is updated to Bytecodes::_invokespecial and the call is no longer virtual from that point on.
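To make the shape of this decision easier to follow, here is a toy model in Java. It is not HotSpot code and does not reproduce GraphBuilder::invoke or ciMethod::resolve_invoke; StaticBindingModel, Invoke, bind and describe are hypothetical names, and reflection stands in for the lookup in the receiver’s methods. The only idea it captures is: if the exact receiver class is known, resolve the target in that class and switch the invoke kind; otherwise leave the callsite alone.

```java
import java.lang.reflect.Method;

// Toy model only, not HotSpot code; all names below are hypothetical.
class StaticBindingModel {
    interface H { void m(); }
    interface I extends H { }

    static class A1 implements I {
        public void m() { System.out.println("A1::m"); }
    }

    enum Kind { INVOKEINTERFACE, INVOKESPECIAL }

    // A callsite: the kind of invoke plus the method it currently targets.
    static final class Invoke {
        final Kind kind;
        final Method target;

        Invoke(Kind kind, Method target) {
            this.kind = kind;
            this.target = target;
        }
    }

    // The starting point in the example: an interface call whose target is H::m.
    static Invoke interfaceCallToM() {
        try {
            return new Invoke(Kind.INVOKEINTERFACE, H.class.getMethod("m"));
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // If the exact receiver class is known, look the method up there and bind statically.
    static Invoke bind(Invoke site, Class<?> exactReceiver) {
        if (exactReceiver == null) {
            return site;  // exact type unknown: keep the virtual call
        }
        try {
            Method resolved = exactReceiver.getMethod(
                    site.target.getName(), site.target.getParameterTypes());
            return new Invoke(Kind.INVOKESPECIAL, resolved);
        } catch (NoSuchMethodException e) {
            return site;  // nothing found: keep the virtual call
        }
    }

    static String describe(Invoke site) {
        return site.kind + " "
                + site.target.getDeclaringClass().getSimpleName() + "::" + site.target.getName();
    }

    public static void main(String[] args) {
        Invoke site = interfaceCallToM();
        System.out.println(describe(site));                  // INVOKEINTERFACE H::m
        System.out.println(describe(bind(site, A1.class)));  // INVOKESPECIAL A1::m
        System.out.println(describe(bind(site, null)));      // INVOKEINTERFACE H::m
    }
}
```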

Full series of related articles:

Part #1
Part #2
Part #3
