Example 3
This article is a continuation of Part #2. The scenario to discuss now is similar to the previous one; the only difference is that class A2 is loaded:
    public static void j2(I i) throws Throwable {
        i.m();
    }

    public static void main(String[] args) throws Throwable {
        new A2();
        j2(new A1());
    }
With class A2 loaded, the virtual callsite in Main::j2 can receive an instance of it at any time. The expectation is, thus, that no static binding is applied.
This is what C1's Main::j2 assembly looks like at the I::m callsite:
    [Verified Entry Point]
      # {method} {0x00007f5da08002d0} 'j2' '(LI;)V' in 'Main'
      0x7f5db4e0b7c0: mov %eax,-0x16000(%rsp)
      ...
      0x7f5db4e0b7cd: movabs $0xffffffffffffffff,%rax
      0x7f5db4e0b7d7: callq 0x7f5db4eabf20 ; ImmutableOopMap {}
                                           ; *invokeinterface m
                                           ; {virtual_call}
      ...
      0x7f5db4e0b7ee: retq
The assumption seems right: there is no static binding or inlining of the called method. However, we see a static call to the JVM instead of a virtual one (itable-based) to a method. Let’s set a breakpoint there and find out what is going on.
Once the breakpoint is hit, we step in. The callee is a trampoline that saves the native execution context (register values) and calls SharedRuntime::resolve_virtual_call_C:
    0x7f5db4eabf20: push %rbp
    0x7f5db4eabf21: mov %rsp,%rbp
    0x7f5db4eabf24: pushfq
    0x7f5db4eabf25: sub $0x8,%rsp
    0x7f5db4eabf29: sub $0x80,%rsp
    0x7f5db4eabf30: mov %rax,0x78(%rsp)
    0x7f5db4eabf35: mov %rcx,0x70(%rsp)
    ...
    0x7f5db4eabf96: callq 0x7ffff6cdd4fa <SharedRuntime::resolve_virtual_call_C(JavaThread*)>
    0x7f5db4eabf9b: movq $0x0,0x2d8(%r15)
It turns out that the callsite is not linked to a method yet, and the symbolic information in the classfile has to be resolved. This is what the I::m reference looks like in the classfile's Constant Pool:
With some formatting:
At run time, this information is available as a table in a ConstantPool object, after its regular fields. We can look at entries 7 to 10:
    (gdb) set $cp = current->_callee_target->_constMethod->_constants
    (gdb) x/1xg (intptr_t*)((((char*)$cp)+sizeof(ConstantPool)))+7
    0x7fffbbc000b0: 0x0000000000090008
    (gdb) x/1xg (intptr_t*)((((char*)$cp)+sizeof(ConstantPool)))+8
    0x7fffbbc000b8: 0x00000000000a0001
    (gdb) x/1xg (intptr_t*)((((char*)$cp)+sizeof(ConstantPool)))+9
    0x7fffbbc000c0: 0x000000000006000b
    (gdb) x/1xg (intptr_t*)((((char*)$cp)+sizeof(ConstantPool)))+10
    0x7fffbbc000c8: 0x00000008004b22f0
    (gdb) x/s ((Symbol*)(*((intptr_t*)((((char*)$cp)+sizeof(ConstantPool)))+10)))->_body
    0x8004b22f6: "I"
There are some differences between the information in the classfile and at run time. In the output above, the run-time entry for InterfaceMethodref holds the indexes of the method holder and name-and-type entries but not the 0x0B tag, and it is 8 bytes long. Entries are always 8 bytes long, and an auxiliary _tags array in the ConstantPool instance holds the entry types. A given index refers to the same entry in both the table and the _tags array. Entries that are UTF-8 strings are represented by a pointer to a Symbol object.
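To make the packing concrete, here is a minimal Java sketch that unpacks two of the 8-byte slots printed by gdb. The class and method names are hypothetical, and the layout (one 16-bit value in the low bits, another in the next 16 bits) is inferred from the printed values rather than taken from a HotSpot header, so treat it as a reading aid only.

```java
// Sketch only: unpacks the run-time constant pool slots shown in the gdb output.
// PackedEntryDecoder, low16 and high16 are hypothetical names; the 16/16 bit
// layout is an assumption inferred from the printed values.
final class PackedEntryDecoder {

    /** Lower 16 bits of a slot. */
    static int low16(long slot) {
        return (int) (slot & 0xFFFF);
    }

    /** Bits 16..31 of a slot. */
    static int high16(long slot) {
        return (int) ((slot >>> 16) & 0xFFFF);
    }

    public static void main(String[] args) {
        long interfaceMethodref = 0x0000000000090008L; // entry #7 as printed by gdb
        long classEntry = 0x00000000000a0001L;         // entry #8 as printed by gdb

        System.out.printf("#7: holder class entry=#%d, name-and-type entry=#%d%n",
                low16(interfaceMethodref), high16(interfaceMethodref));
        System.out.printf("#8: name entry=#%d, low 16 bits=%d%n",
                high16(classEntry), low16(classEntry));
    }
}
```

Entry #7 decodes to the pair (#8, #9), and entry #8 carries 0x000a (entry #10, the "I" symbol) in bits 16..31 plus a 0x0001 in its low bits.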
I won't describe every difference, but class constant entries deserve a comment. As seen in the output above, entry #8 for class I has the UTF-8 reference 0x000a but also a 0x0001 value in its lower bytes. This value is an index into the ConstantPool's _resolved_klasses auxiliary array, which contains a Klass* for each resolved class. The C++ class that describes class constant entries in the run-time ConstantPool is CPKlassSlot. Its corresponding tag in the _tags array is JVM_CONSTANT_UnresolvedClass (0x64) while the class is unresolved (instead of JVM_CONSTANT_Class as in the classfile's Constant Pool), and it is updated after resolution.
InterfaceMethodref entries have to be resolved in the same way as classes. The same is true for other field, method and handle references. Instead of an auxiliary array such as _resolved_klasses, a structure called ConstantPoolCache holds the information. This structure exists to improve performance by avoiding calls into the runtime: it is accessed frequently, and we have already seen that entering the runtime requires saving the whole context. The ConstantPoolCache associated with the ConstantPool in our example has a table (after its regular fields) with 5 entries:
    (gdb) set $cpc = $cp->_cache
    (gdb) print *((ConstantPoolCacheEntry*)(((char*)$cpc)+sizeof(ConstantPoolCache))+0)
    $37 = {_indices = 1, _f1 = 0x0, _f2 = 0, _flags = 0}
    (gdb) print *((ConstantPoolCacheEntry*)(((char*)$cpc)+sizeof(ConstantPoolCache))+1)
    $38 = {_indices = 7, _f1 = 0x0, _f2 = 0, _flags = 0}
    (gdb) print *((ConstantPoolCacheEntry*)(((char*)$cpc)+sizeof(ConstantPoolCache))+2)
    $39 = {_indices = 11993102, _f1 = 0x7fffbbc00970, _f2 = 0, _flags = -1879048191}
    (gdb) print *((ConstantPoolCacheEntry*)(((char*)$cpc)+sizeof(ConstantPoolCache))+3)
    $40 = {_indices = 11993105, _f1 = 0x7fffbbc00dd0, _f2 = 0, _flags = -1879048191}
    (gdb) print *((ConstantPoolCacheEntry*)(((char*)$cpc)+sizeof(ConstantPoolCache))+4)
    $41 = {_indices = 12058642, _f1 = 0x7fffbbc002d0, _f2 = 0, _flags = -1874853887}
The _indices field has a reference to the corresponding ConstantPool entry in its lower 16 bits. Thus, the five entries above (cache indexes 0 to 4) point to entries 1 (Object::<init> Methodref), 7 (I::m InterfaceMethodref), 14 (A1::<init> Methodref), 17 (A2::<init> Methodref) and 18 (Main::j2 Methodref) in the ConstantPool.
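The mapping can be checked by masking the decimal _indices values that gdb printed. The sketch below does that in Java; the class and method names are hypothetical. It also prints bits 16..23, which for the resolved entries happen to equal 0xb7 and 0xb8, the opcodes of invokespecial and invokestatic. Reading that byte as a bytecode is my assumption here; the only part relied upon in this article is the lower 16 bits.

```java
// Sketch only: extracts the ConstantPool index from the _indices values in the
// gdb output. CacheIndicesDecoder and its methods are hypothetical names.
final class CacheIndicesDecoder {

    /** Lower 16 bits: index of the corresponding ConstantPool entry. */
    static int constantPoolIndex(int indices) {
        return indices & 0xFFFF;
    }

    /** Bits 16..23; interpreting them as an opcode is an assumption. */
    static int bits16to23(int indices) {
        return (indices >>> 16) & 0xFF;
    }

    public static void main(String[] args) {
        int[] printedIndices = {1, 7, 11993102, 11993105, 12058642};
        for (int cacheIndex = 0; cacheIndex < printedIndices.length; cacheIndex++) {
            int indices = printedIndices[cacheIndex];
            System.out.printf("cache[%d] -> ConstantPool #%d (bits 16..23 = 0x%02x)%n",
                    cacheIndex, constantPoolIndex(indices), bits16to23(indices));
        }
    }
}
```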
How does this work? The bytecode instructions in a classfile might contain references to the ConstantPool. This is what the I::m invocation in particular looks like:
With some formatting:
The reference in the bytecode instruction, as read from the file, is to the ConstantPool entry. However, we've just said that the run-time information is held in a ConstantPoolCache entry, which has a different index. It turns out that bytecode instructions are patched when loaded into memory, so references to the ConstantPool are replaced with references to the ConstantPoolCache where applicable. This is what the Main::j2 bytecodes, including the invocation of I::m, look like in memory:
    (gdb) x/7xb (char*)(current->_callee_target->_constMethod)+sizeof(ConstMethod)
    0x7fffbbc002c0: 0x2a 0xb9 0x01 0x00 0x01 0x00 0xb1
Notice how the original 0x0007 reference was changed to 0x0001 (endianness aside), while the instructions aload_0 (0x2a), invokeinterface (0xb9) and return (0xb1) remained the same.
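The "endianness aside" remark can be illustrated with a small Java sketch that reads the two operand bytes after invokeinterface in both byte orders. The in-memory bytes are copied from the gdb dump; the classfile bytes are reconstructed from the description (same instructions, operand 0x0007 in big-endian order), so they are an assumption, and all names are hypothetical.

```java
// Sketch only: compares the invokeinterface operand before and after patching.
// RewrittenOperand and its methods are hypothetical names.
final class RewrittenOperand {

    /** Reads an unsigned 16-bit value in classfile (big-endian) order. */
    static int u2BigEndian(byte[] code, int offset) {
        return ((code[offset] & 0xFF) << 8) | (code[offset + 1] & 0xFF);
    }

    /** Reads an unsigned 16-bit value in little-endian (x86-64 native) order. */
    static int u2LittleEndian(byte[] code, int offset) {
        return (code[offset] & 0xFF) | ((code[offset + 1] & 0xFF) << 8);
    }

    // Reconstructed classfile bytes (assumption): aload_0, invokeinterface #7, count 1, 0, return.
    static final byte[] IN_CLASSFILE = {0x2a, (byte) 0xb9, 0x00, 0x07, 0x01, 0x00, (byte) 0xb1};
    // Bytes as dumped by gdb from the ConstMethod in memory.
    static final byte[] IN_MEMORY = {0x2a, (byte) 0xb9, 0x01, 0x00, 0x01, 0x00, (byte) 0xb1};

    public static void main(String[] args) {
        // The operand starts right after the invokeinterface opcode, at offset 2.
        System.out.println("classfile operand (ConstantPool index): "
                + u2BigEndian(IN_CLASSFILE, 2));
        System.out.println("in-memory operand (ConstantPoolCache index): "
                + u2LittleEndian(IN_MEMORY, 2));
    }
}
```

Cache index 1 is the entry whose _indices value is 7, which closes the loop back to the I::m InterfaceMethodref.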
The JVM resolves callsites lazily, which speeds up initialization and avoids paying the cost for callsites that execution never reaches. As seen above, a pointer to a resolution stub is left in place if a compiled method has unresolved callsites. We should now expect a full method resolution, perhaps an update to the ConstantPoolCache entry for I::m, and the callsite in Main::j2's assembly to be patched.
I'll leave the callsite resolution for a future article, but let's see what C1's Main::j2 looks like just after it completes:
    0x7f5db4e0b7c0: mov %eax,-0x16000(%rsp)
    ...
    0x7f5db4e0b7cd: movabs $0x801000400,%rax
    0x7f5db4e0b7d7: callq 0x7fffe5601300
    ...
    0x7f5db4e0b7ee: retq
There are two things to notice: 1) the value 0x801000400 is loaded into rax (instead of 0xffffffffffffffff), and 2) there is a call to 0x7fffe5601300.
1st observation:
    (gdb) x/s ((InstanceKlass*)0x801000400)->_name->_body
    0x7ffff0184ee6: "A1"
The fact that there is a reference to the A1 class means that the callsite resolution was not purely based on symbolic linking information (which would have led to finding m's position in H's itable) but also considered the current receiver's type.
2nd observation:
    0x7fffe5601300: mov 0x8(%rsi),%r10d
    0x7fffe5601304: movabs $0x800000000,%r11
    0x7fffe560130e: add %r11,%r10
    0x7fffe5601311: cmp %rax,%r10
    0x7fffe5601314: jne 0x7fffe56a1920
    ...
    0x7fffe5601320: mov %eax,-0x16000(%rsp)
    ...
Based on the x86-64 calling convention, we know that the receiver object is in the rsi register. The code reads the class reference from the object's header and decompresses it into an InstanceKlass*. If the receiver's class is A1, then the method (which we presume to be A1::m) is executed. Otherwise, there is a jump to 0x7fffe56a1920. The important point here is that the j2 method considers the possibility of a receiver object at the I::m callsite whose type is not A1. Thus, we have evidence that there is no static binding.
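The arithmetic of that check can be mirrored in a few lines of Java. This is only a sketch of the five instructions above: the base 0x800000000 and the expected value 0x801000400 come from the listings, while the narrow value 0x01000400 is derived by subtraction and the second narrow value is purely hypothetical. The names are mine as well.

```java
// Sketch only: mirrors the mov/movabs/add/cmp sequence at the unverified entry.
// KlassCheck and its members are hypothetical names.
final class KlassCheck {

    /** Base added to the 32-bit class reference (the movabs operand). */
    static final long KLASS_BASE = 0x800000000L;

    /** Zero-extends the 32-bit value, as a load into r10d does, and adds the base. */
    static long decode(int narrowKlass) {
        return KLASS_BASE + (narrowKlass & 0xFFFFFFFFL);
    }

    /** True when the decoded class equals the value the caller placed in rax. */
    static boolean matches(int narrowKlass, long expectedKlass) {
        return decode(narrowKlass) == expectedKlass;
    }

    public static void main(String[] args) {
        long expectedA1 = 0x801000400L;  // value loaded into rax by Main::j2
        int narrowA1 = 0x01000400;       // derived: 0x801000400 - 0x800000000
        int narrowOther = 0x01000600;    // hypothetical class, for illustration only

        System.out.println("A1 receiver matches: " + matches(narrowA1, expectedA1));
        System.out.println("other receiver matches: " + matches(narrowOther, expectedA1));
    }
}
```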
0x7fffe56a1920 leads to a SharedRuntime::handle_wrong_method_ic_miss(JavaThread*) runtime call. What we have seen is an optimization called inline cache. The callsite is optimistically linked to a concrete method, but with safeguards in place: the rax register carries the receiver type for which the chosen method is correct, and that type is validated against the actual receiver passed at run time. Notice that rax is used so as not to interfere with the method arguments, which are held in the registers indicated by the calling convention. Methods have an unverified entry point that performs this verification before the real (verified) one. This is faster than going through the ConstantPoolCache entry and the itable.
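As a rough mental model, the inline cache can be sketched in plain Java: a callsite remembers one receiver class and one target, takes the fast path when the class matches, and falls back to a slow path otherwise. Everything in this sketch is hypothetical, including the class definitions for I, A1 and A2 and the policy of relinking to the new class on a miss; what HotSpot actually does inside handle_wrong_method_ic_miss is not covered here.

```java
import java.util.function.Function;

// Sketch only: a toy model of a monomorphic inline cache, not HotSpot's code.
final class InlineCacheSketch {

    // Hypothetical stand-ins for the classes used in this series.
    interface I { String m(); }
    static final class A1 implements I { public String m() { return "A1::m"; } }
    static final class A2 implements I { public String m() { return "A2::m"; } }

    private Class<?> cachedClass;             // plays the role of the class in rax
    private Function<I, String> cachedTarget; // plays the role of the call destination
    int misses;                               // how often the slow path ran

    /** The callsite: check the receiver class, then take the fast or slow path. */
    String invoke(I receiver) {
        if (receiver.getClass() == cachedClass) {
            return cachedTarget.apply(receiver); // hit: call the linked target
        }
        return handleMiss(receiver);             // miss: go to the "runtime"
    }

    /** Slow path: counts the miss and relinks (hypothetical policy). */
    private String handleMiss(I receiver) {
        misses++;
        cachedClass = receiver.getClass();
        cachedTarget = I::m; // stand-in for resolving and linking the target
        return cachedTarget.apply(receiver);
    }

    public static void main(String[] args) {
        InlineCacheSketch callsite = new InlineCacheSketch();
        System.out.println(callsite.invoke(new A1()) + ", misses=" + callsite.misses);
        System.out.println(callsite.invoke(new A1()) + ", misses=" + callsite.misses);
        System.out.println(callsite.invoke(new A2()) + ", misses=" + callsite.misses);
    }
}
```

In this model the first call counts as a miss because the callsite starts unlinked, which loosely corresponds to the resolution stub seen earlier.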
Finally, let's look at the CHA decision in GraphBuilder::invoke. These are some initial context values when called:
Bytecodes::Code code: _invokeinterface
ciMethod* target: H::m
ciKlass* holder: I
ciInstanceKlass* klass: H
ciInstanceKlass* calling_klass: Main
ciInstanceKlass* callee_holder: I
ciInstanceKlass* actual_recv: I
The values above are exactly the same as in Example 1 and Example 2. What happens next is identical to Example 2 until the call to declared_interface->unique_implementor(). This time around, I's implementor field points to itself to indicate that there is more than one implementor:
    (gdb) set $IClass = (InstanceKlass*)0x801000c28
    (gdb) set $IImplementor = (*((InstanceKlass**)((char*)$IClass+sizeof(InstanceKlass)+($IClass->_vtable_len*sizeof(void*))+($IClass->_itable_len*sizeof(void*))+($IClass->_nonstatic_oop_map_size*sizeof(void*)))))
    (gdb) print/x $IImplementor
    $1 = 0x801000c28
Thus, singleton is null here, cha_monomorphic_target is null as well, and the conditions for static binding are not met.
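The self-pointing convention for the implementor field can be modelled with a small Java sketch. The Klass class below is a hypothetical stand-in, not HotSpot's InstanceKlass, and the update rule in addImplementor is a simplification I am assuming for illustration: null means no implementor has been recorded, another class means exactly one, and the interface itself means more than one.

```java
// Sketch only: a toy model of the implementor field's three states.
final class ImplementorSketch {

    static final class Klass {
        final String name;
        Klass implementor; // null: none; another Klass: exactly one; this: more than one

        Klass(String name) {
            this.name = name;
        }

        /** Records a new implementor (simplified, hypothetical update rule). */
        void addImplementor(Klass k) {
            if (implementor == null) {
                implementor = k;
            } else if (implementor != k) {
                implementor = this; // a second, different implementor was seen
            }
        }

        /** Returns the single implementor, or null when there is none or several. */
        Klass uniqueImplementor() {
            return (implementor == null || implementor == this) ? null : implementor;
        }
    }

    public static void main(String[] args) {
        Klass i = new Klass("I");
        Klass a1 = new Klass("A1");
        Klass a2 = new Klass("A2");

        i.addImplementor(a1);
        System.out.println("after A1: " + i.uniqueImplementor().name);

        i.addImplementor(a2);
        System.out.println("after A2: unique implementor is null? "
                + (i.uniqueImplementor() == null)
                + ", field points to itself? " + (i.implementor == i));
    }
}
```

With only A1 recorded the model returns a unique implementor, as in Example 2; once A2 is added the field points back at I and no unique implementor is reported, matching the gdb output above.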
Full series of related articles: