JVM Class Relinker v1.0

Virtually every Java class needs references to other classes to do anything meaningful. The example below, though not very meaningful itself, suffices to show them in action:
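A minimal reconstruction of such an example, assuming only what the rest of this post says about it: an interface I with a method m, a non-final class A, a class B that extends A and implements I, and a Main whose main method stores a new B in a local variable i of type I. The body of m and the StudyCase holder are placeholders. The holder only keeps the sketch in a single file, whereas the post treats I, A, B and Main as four top-level types.

```java
// Sketch only: in the post, I, A, B and Main are four top-level types.
// The StudyCase holder is hypothetical and exists only to keep this in one file.
final class StudyCase {

    interface I {
        void m(); // declares no checked exceptions
    }

    static class A { // not final, so B is allowed to extend it
    }

    static class B extends A implements I {
        @Override
        public void m() {
            System.out.println("B.m() called"); // the body is an assumption
        }
    }

    static class Main {
        public static void main(String[] args) {
            I i = new B(); // legal because B implements I
            i.m();
        }
    }
}
```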

The javac compiler takes the Java source of these classes and turns it into four separate binary files called classfiles. Each of them includes the JVM executable instructions (bytecode), the data they operate on, and linkage information that identifies references to external classes by name. References of this kind are commonly known as symbolic references.


Main classfile: highlighted in blue, what appears to be a symbolic reference to class B.
Classfiles contain an area called the Constants Pool, where references live. There are different types of references, but our focus will be on class references. In addition to its type (class, in this case), every such reference holds an index into the Pool where the symbol name (as a UTF-8 string) is located. We can visualize this by running the javap -v command on the Main classfile:
Main class Constants Pool
Position #7 of the Pool contains a class reference, and when navigating to the symbol name (position #8) we see how this reference is to class B.
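To make the "class reference points to a UTF-8 name" idea concrete, here is a small sketch that walks the Pool by hand and lists the symbol names of its class references. It relies only on the classfile layout (magic number, versions, then the tagged Pool entries). The class and method names are hypothetical helpers and are not part of JVM Class Relinker.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JVM Class Relinker.
final class ConstantPoolClassRefs {

    /** Lists the symbol names of all class references found in a classfile's Pool. */
    static List<String> classReferences(InputStream classfile) {
        try {
            DataInputStream in = new DataInputStream(classfile);
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a classfile");
            }
            in.readUnsignedShort();             // minor_version
            in.readUnsignedShort();             // major_version
            int count = in.readUnsignedShort(); // Pool positions run from 1 to count-1

            String[] utf8 = new String[count];  // UTF-8 entries, by position
            int[] nameIndex = new int[count];   // class entries: position of the name (0 = not a class)

            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1:  utf8[i] = in.readUTF(); break;                // Utf8
                    case 7:  nameIndex[i] = in.readUnsignedShort(); break; // Class
                    case 8: case 16: case 19: case 20:                     // String, MethodType, Module, Package
                        in.readUnsignedShort(); break;
                    case 15:                                               // MethodHandle
                        in.readUnsignedByte(); in.readUnsignedShort(); break;
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.readInt(); break;                               // entries with 4 bytes of payload
                    case 5: case 6:                                        // Long and Double take two positions
                        in.readLong(); i++; break;
                    default:
                        throw new IllegalArgumentException("unknown Pool tag " + tag);
                }
            }

            List<String> names = new ArrayList<>();
            for (int i = 1; i < count; i++) {
                if (nameIndex[i] != 0) {
                    names.add(utf8[nameIndex[i]]); // follow the index to the symbol name
                }
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Same as above, for a classfile found on the class path (e.g. "java/util/ArrayList.class"). */
    static List<String> ofResource(String resourceName) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IllegalArgumentException("resource not found: " + resourceName);
            }
            return classReferences(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : ofResource(args[0])) {
            System.out.println(name);
        }
    }
}
```

Names come back in their internal form, with slashes instead of dots (java/lang/Object rather than java.lang.Object).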
Once a reference is in the Constants Pool, it can be pointed to multiple times by its position. See, for example, how the new bytecode instruction in the Main::main method uses a reference to class B:
Main bytecode instructions.
Well-specified references are not enough. For a classfile to be loaded into the JVM, the instruction set needs to be valid, the stack must be balanced (for each instruction that pushes, one has to consume), type assignments must be legal, and the list goes on. javac will always output valid classfiles. Thus, going back to the example above, it will ask questions such as…

  • does class A, referenced from B, exist and is it reachable?
  • is class A non-final so B can extend it?
  • does B implement I so its new instance in Main::main is assignable to the local variable i?
  • does I::m throw any checked exception that is not handled or declared to be thrown in Main::main?
  • etc.

The first observation I want to make is that this is not about security. Classfiles can be generated by non-javac compilers, tampered with by instrumentation tools or even written entirely by hand with a hexadecimal editor. No matter how they are generated, all of them can be passed to the JVM. The final judgement on whether a classfile is valid or not will, and must, take place at load time in the JVM.

If you are curious about classfile verification in the JVM, I’d suggest looking at the ClassVerifier::verify_class method. As an example, we can pick one of the type checks. Bytecode instructions are walked through and, for each one that pushes a value to the stack, its type is tracked in a shadow stack (such as here). Note how statically-typed languages allow this information to be known at this point. When an instruction consumes from the stack, a type is retrieved from the shadow stack (such as here). The type obtained must match the one the instruction expects (such as this). Actual values do not matter and are unknown until the code is run: this is a symbolic pre-execution.
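The shadow-stack idea can be sketched in a few lines. This is a toy model under heavy assumptions: only two types, four made-up instruction descriptions, and no control flow. HotSpot's verifier is far richer, and none of the names below come from it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the idea only; not a port of the HotSpot verifier.
final class ShadowStackSketch {

    enum Type { INT, REFERENCE }

    /** A toy instruction: the types it consumes (top of stack first) and the type it pushes, if any. */
    static final class Insn {
        final String name;
        final Type pushes;     // null when nothing is pushed
        final Type[] consumes;

        Insn(String name, Type pushes, Type... consumes) {
            this.name = name;
            this.pushes = pushes;
            this.consumes = consumes;
        }
    }

    static final Insn ICONST_1 = new Insn("iconst_1", Type.INT);
    static final Insn ACONST_NULL = new Insn("aconst_null", Type.REFERENCE);
    static final Insn IADD = new Insn("iadd", Type.INT, Type.INT, Type.INT);
    static final Insn IRETURN = new Insn("ireturn", null, Type.INT);

    /** Returns true when every consumed type matches what the instruction expects. */
    static boolean typeChecks(Insn... code) {
        Deque<Type> shadow = new ArrayDeque<>(); // holds types, never values
        for (Insn insn : code) {
            for (Type expected : insn.consumes) {
                Type actual = shadow.poll();     // null if the shadow stack is empty
                if (actual != expected) {
                    return false;                // wrong type, or nothing left to consume
                }
            }
            if (insn.pushes != null) {
                shadow.push(insn.pushes);
            }
        }
        return true;
    }
}
```

In this model, typeChecks(ICONST_1, ICONST_1, IADD, IRETURN) is accepted, while typeChecks(ICONST_1, ACONST_NULL, IADD) is rejected because iadd finds a reference where it expects an int. No value is ever computed along the way.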

With all this introduction about references and classfile verification, let’s have some fun! We will mess with references and push verification to its limits.

JVM Class Relinker is a security testing tool to invalidate classfile linkage checks between compile time and load time. In other words, it lets you link against a class during compilation but load a different ‘version’ of the same class at run time. Under these circumstances, all of javac’s earlier checks become obsolete and the JVM’s checks at load time become essential.

To illustrate: in our case study, class A was not final at compile time, but what if class A’(*) is final at load time?

Discussing the implications of extending a final class is out of scope, but believe me: nothing good can come of it from a security standpoint. Quick spoiler, so Java devs can sleep tonight:

What we need to try this tricky input on the JVM is four classfiles: I, A’, B and Main. As previously discussed, javac will refuse to compile these classes together because B extending A’ would break the rules. Let’s create two groups of classes for compilation purposes only: I, A, B and Main in one group, and I and A’ in the other. No rule is broken within each group, and six classfiles are generated at the end (of which the two I classfiles are duplicates). We finally pick the I, A’, B and Main classfiles and pass them to the JVM for execution.

javac compilation and JVM load groups
Too much manual work, right?

JVM Class Relinker takes this concept but adds some twists. All classes will be compiled at once -as part of the same compilation group- but:

  1. non-colliding names will be used for A (A representing the final class and AComp the non-final one); and
  2. B will extend class AComp.

JVM Class Relinker compilation group

There won’t be compilation issues, but this is not what we want: B is not extending the final class A. So here comes the magic: class B will be re-written on-the-fly (at load time) with all references to AComp replaced by references to A. Let’s break down this idea into parts:

At load time…

JVM Class Relinker Class Loader in its hierarchy
By means of a custom Class Loader, the JVM lets you control class definition down to the very bytes of each class. The JVM Class Relinker Class Loader inserts itself into a delegation hierarchy and is capable of either defining application classes (i.e. I, A, AComp, etc.) or passing the responsibility to the Platform Class Loader above. When defining application classes, it gets the classfile bytes from the file-system, looks for reference replacements (if any), and defines the class in the JVM.
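A minimal sketch of such a loader, under stated assumptions: the class name, the classfileSource function (standing in for the file-system lookup) and the rewriter hook are all hypothetical, and the tool's real loader may be organised differently. With the Platform Class Loader as parent, the standard parent-first delegation does the "pass it upwards" part, and findClass only runs for classes the parent cannot find. The emptyClass helper hand-builds a trivial classfile so the demo needs no files on disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Sketch only; names are hypothetical, not the tool's.
final class RelinkingLoaderSketch extends ClassLoader {
    private final Function<String, byte[]> classfileSource; // returns null for unknown classes
    private final UnaryOperator<byte[]> rewriter;           // applies reference replacements, if any

    RelinkingLoaderSketch(ClassLoader parent,
                          Function<String, byte[]> classfileSource,
                          UnaryOperator<byte[]> rewriter) {
        super(parent);
        this.classfileSource = classfileSource;
        this.rewriter = rewriter;
    }

    /** Called by loadClass only after the parent has failed to find the class. */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] original = classfileSource.apply(name);
        if (original == null) {
            throw new ClassNotFoundException(name);
        }
        byte[] bytes = rewriter.apply(original);          // we decide the final bytes here
        return defineClass(name, bytes, 0, bytes.length); // the JVM verifies them at this point or later
    }

    /** Demo input: the bytes of an empty public class that extends java.lang.Object. */
    static byte[] emptyClass(String internalName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);
            out.writeShort(0);                                  // minor_version
            out.writeShort(52);                                 // major_version (Java 8)
            out.writeShort(5);                                  // constant_pool_count (entries + 1)
            out.writeByte(7); out.writeShort(2);                // #1 Class, name at #2
            out.writeByte(1); out.writeUTF(internalName);       // #2 Utf8
            out.writeByte(7); out.writeShort(4);                // #3 Class, name at #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8
            out.writeShort(0x0021);                             // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                  // this_class
            out.writeShort(3);                                  // super_class
            out.writeShort(0);                                  // interfaces_count
            out.writeShort(0);                                  // fields_count
            out.writeShort(0);                                  // methods_count
            out.writeShort(0);                                  // attributes_count
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        ClassLoader loader = new RelinkingLoaderSketch(
                ClassLoader.getPlatformClassLoader(),
                name -> name.equals("Hello") ? emptyClass("Hello") : null,
                bytes -> bytes); // identity: no replacements in this demo
        System.out.println(loader.loadClass("Hello").getClassLoader() == loader);
    }
}
```

Passing the Platform Class Loader explicitly matters: the no-argument ClassLoader constructor would use the Application Class Loader as parent, which could then resolve application classes itself before findClass is ever reached.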

…re-written…

When a reference replacement is needed, the JVM Class Relinker will fully re-write the class. A raw approach, such as modifying the symbolic name in the Constants Pool, would be possible, but the lack of APIs to parse this data and the consistency implications of a change (e.g. modifying the length of a symbolic name may require sizes or offsets elsewhere to be adjusted) made me rule it out.
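For comparison, this is roughly what the raw approach could look like. It is a sketch of the option ruled out above, not of how JVM Class Relinker works, and the class and method names are hypothetical. It copies the classfile while swapping any Pool UTF-8 entry equal to one name for another. It is deliberately naive: it renames the entry no matter what uses it (a class reference, a string constant, a member name), and it does not touch names embedded in descriptors or signatures such as LAComp;.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch of the raw approach the post rules out; names are hypothetical.
final class RawPoolRenameSketch {

    /** Copies a classfile, replacing Pool UTF-8 entries equal to 'from' with 'to'. */
    static byte[] rename(byte[] classfile, String from, String to) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(classfile));
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);

            out.writeInt(in.readInt());         // magic
            out.writeInt(in.readInt());         // minor_version and major_version
            int count = in.readUnsignedShort(); // constant_pool_count
            out.writeShort(count);

            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                out.writeByte(tag);
                switch (tag) {
                    case 1: {                   // Utf8: written back with its own length prefix
                        String s = in.readUTF();
                        out.writeUTF(s.equals(from) ? to : s);
                        break;
                    }
                    case 7: case 8: case 16: case 19: case 20:
                        out.writeShort(in.readUnsignedShort());
                        break;
                    case 15:
                        out.writeByte(in.readUnsignedByte());
                        out.writeShort(in.readUnsignedShort());
                        break;
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        out.writeInt(in.readInt());
                        break;
                    case 5: case 6:             // Long and Double take two Pool positions
                        out.writeLong(in.readLong());
                        i++;
                        break;
                    default:
                        throw new IllegalArgumentException("unknown Pool tag " + tag);
                }
            }

            // Everything after the Pool is copied verbatim.
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                out.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Even in this narrow form, every Pool tag has to be understood just to find the UTF-8 entries, which gives a feel for why delegating the parsing and re-writing to a library is attractive.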

JVM Class Relinker uses ASM for class re-writing. This library is capable of going through every byte in a classfile (representing class metadata, methods, instructions, etc.) and lets us decide how to write it down to a new classfile output. For those interested in software design patterns, this works as a typical Visitor pattern in which we extend a visitor that receives callbacks from a Reader and forwards them to a Writer (for output). The Constants Pool for the re-written class is generated automatically by the library as needed.

This is what a super-class reference replacement looks like:

This method will be called when reading class B. The superName parameter will contain the ‘AComp’ value. However, it will end up passing ‘A’ to the Writer.

…all references replaced.

Constants Pool in class B before and after reference replacements.
The super-class example above is just one of the possible reference replacements needed. We need to check field accesses, method and constructor invocations, new object instances, new arrays, cast instructions and so on.
Bytecode instructions in class B before and after reference replacements.
To wrap up, and so you can start playing with the tool, a few implementation details:

  • Classes subject to reference replacements are those internal to class Main
  • A special annotation type can be used to specify replacements on a class (syntax: original-class->replaced-class[,...,original-class-n->replaced-class-n])
  • The method called for execution (with or without replacements) is Main::execute (write your testing code there)
  • Invoke JVMClassRelinker.execute() for action (execution goes to Main::execute after replacements)

Output:

Bonus track

We said that Main::execute gets called with class reference replacements applied. What’s wrong with that?

The Main::execute method was compiled by javac and loaded into the JVM by the Application Class Loader. Thus, any reference there points to a non-replaced class after load-time linking. For example, the class B in new B() is the one extending AComp. How can the JVM Class Relinker invoke that method and expect its own replaced classes there? In reality, the JVM Class Relinker generates a shadow Main class with a copy of the Main::execute method. When the shadow Main is loaded, linkage will be against classes loaded by the JVM Class Relinker Class Loader itself. This is for syntax convenience only, so we don’t need reflection or handle APIs to invoke methods or access fields.

If you find any security issue in OpenJDK, please report it according to this procedure.

Code is under a GPLv3 license.

Download JVM Class Relinker (v1.0)

(*) A’ is a conceptual name only: the actual class name is A and collides with the compile-time A. Thus, we have two ‘versions’ of A.
