Quarkslab folks proposed a challenge: modify any compiler to generate code with some obfuscation but valid semantics. In other words, add some noise but keep the program logic valid.
Time to have some fun with C1, OpenJDK’s first-tier JIT compiler.
Obfuscation “noise” can be added at different stages of the compilation process. Adding it early may be useful to target multiple architectures, but then it’s necessary to check that later optimization passes do not undo the effect.
The basic flow for C1 is the following: Java bytecodes -> IR (Intermediate Representation) -> LIR (Low-level Intermediate Representation) -> assembly code (architecture specific). A few key functions involved in these transformations: GraphBuilder::iterate_bytecodes_for_block (for IR generation), LIRGenerator::block_do (for LIR generation) and LIR_Assembler::emit_lir_list (for assembly generation).
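To make the early-versus-late trade-off concrete, here is a toy Java model. It is not HotSpot code: instructions are plain strings, and every name in it (`ToyPipeline`, `NOISE`, `eliminateDeadCode`) is hypothetical. The "optimizer" simply drops instructions it considers side-effect free, which is the kind of pass that could undo noise injected too early.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: instructions are plain strings, not real C1 IR/LIR nodes.
class ToyPipeline {
    // Hypothetical marker for a side-effect-free noise instruction.
    static final String NOISE = "noise";

    // Insert a noise instruction before every "add".
    static List<String> addNoise(List<String> code) {
        List<String> out = new ArrayList<>();
        for (String insn : code) {
            if (insn.equals("add")) {
                out.add(NOISE);
            }
            out.add(insn);
        }
        return out;
    }

    // Stand-in for an optimization pass that removes instructions
    // without observable effects.
    static List<String> eliminateDeadCode(List<String> code) {
        List<String> out = new ArrayList<>();
        for (String insn : code) {
            if (!insn.equals(NOISE)) {
                out.add(insn);
            }
        }
        return out;
    }

    // Noise injected before the optimizer runs: it gets removed again.
    static List<String> noiseEarly(List<String> code) {
        return eliminateDeadCode(addNoise(code));
    }

    // Noise injected after the optimizer runs: it survives.
    static List<String> noiseLate(List<String> code) {
        return addNoise(eliminateDeadCode(code));
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList("load", "push", "add", "store");
        System.out.println("early: " + noiseEarly(code));
        System.out.println("late:  " + noiseLate(code));
    }
}
```

In this model the early variant ends up with no noise at all, while the late variant keeps it in front of the `add`. Whether a real C1 pass removes a given piece of noise depends on what the noise looks like to that pass.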
This is my basic pseudo-random Java test code (don’t even try to make sense of it!):
    public class Main {
        private static volatile int a;
        private static String s;
        public static void main(String[] args) {
            System.out.println("Jit example started");
            int max_iterations = 15000;
            while(max_iterations-- > 0) {
                f();
            }
            System.out.println("a (final): " + a);
            System.out.println("s (final): " + s);
        }
        private static void f() {
            int i = 100;
            while (i-- > 0) {
                a = a + 1234;
            }
            s = new Integer(a).toString();
        }
    }
Function Main::f is called many times in a loop so it will be compiled by C1 first.
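One hedged way to confirm that a JIT compiler is actually active while such a loop runs is to query the standard `java.lang.management.CompilationMXBean`. The sketch below reuses the shape of `f()`; the class name `JitProbe` and the helper `describeJit` are made up for illustration. On HotSpot, running with `-XX:+PrintCompilation` (and optionally `-XX:TieredStopAtLevel=1` to stay in C1) should give a more detailed, per-method view.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical probe class; only the f() loop mirrors the test program.
class JitProbe {
    private static volatile int a;

    // Same shape as Main::f: a small hot loop over a volatile field.
    static void f() {
        int i = 100;
        while (i-- > 0) {
            a = a + 1234;
        }
    }

    // Describe the JIT the running JVM reports, or say that none is available.
    static String describeJit() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler reported";
        }
        String time = jit.isCompilationTimeMonitoringSupported()
                ? jit.getTotalCompilationTime() + " ms spent compiling"
                : "compilation time not monitored";
        return jit.getName() + ": " + time;
    }

    public static void main(String[] args) {
        // Call f() often enough to make it a compilation candidate.
        for (int n = 0; n < 15000; n++) {
            f();
        }
        System.out.println(describeJit());
        System.out.println("a (final): " + a);
    }
}
```

The reported compilation time covers the whole JVM, so it only shows that compilation happened, not that `f()` specifically was compiled.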
This is Main::f in bytecodes:
    private static void f();
      Code:
         0: bipush        100
         2: istore_0
         3: iload_0
         4: iinc          0, -1
         7: ifle          23
        10: getstatic     #6   // Field a:I
        13: sipush        1234
        16: iadd
        17: putstatic     #6   // Field a:I
        20: goto          3
        23: new           #10  // class java/lang/Integer
        26: dup
        27: getstatic     #6   // Field a:I
        30: invokespecial #11  // Method java/lang/Integer."<init>":(I)V
        33: invokevirtual #12  // Method java/lang/Integer.toString:()Ljava/lang/String;
        36: putstatic     #8   // Field s:Ljava/lang/String;
        39: return
My obfuscation noise will be inserted before every addition operation. In this case, that means before the iadd instruction.
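As an illustration of "find every addition", the sketch below walks the bytecode of Main::f and reports the offsets of `iadd` opcodes. The byte array is hand-assembled from the javap listing above, and the length table covers only the opcodes that appear in this method. The class and method names are hypothetical, and however the patch locates additions inside C1, it is not by scanning raw bytes like this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: locates iadd instructions in Main::f's bytecode.
class IaddFinder {
    static final int IADD = 0x60;

    // Bytecode of Main::f, hand-assembled from the javap listing.
    static final int[] MAIN_F = {
        0x10, 0x64,         //  0: bipush 100
        0x3b,               //  2: istore_0
        0x1a,               //  3: iload_0
        0x84, 0x00, 0xff,   //  4: iinc 0, -1
        0x9e, 0x00, 0x10,   //  7: ifle 23
        0xb2, 0x00, 0x06,   // 10: getstatic #6
        0x11, 0x04, 0xd2,   // 13: sipush 1234
        0x60,               // 16: iadd
        0xb3, 0x00, 0x06,   // 17: putstatic #6
        0xa7, 0xff, 0xef,   // 20: goto 3
        0xbb, 0x00, 0x0a,   // 23: new #10
        0x59,               // 26: dup
        0xb2, 0x00, 0x06,   // 27: getstatic #6
        0xb7, 0x00, 0x0b,   // 30: invokespecial #11
        0xb6, 0x00, 0x0c,   // 33: invokevirtual #12
        0xb3, 0x00, 0x08,   // 36: putstatic #8
        0xb1                // 39: return
    };

    // Instruction length in bytes, only for the opcodes used by Main::f.
    static int length(int opcode) {
        switch (opcode) {
            case 0x1a: case 0x3b: case 0x59: case 0x60: case 0xb1:
                return 1;
            case 0x10:
                return 2;
            case 0x11: case 0x84: case 0x9e: case 0xa7:
            case 0xb2: case 0xb3: case 0xb6: case 0xb7: case 0xbb:
                return 3;
            default:
                throw new IllegalArgumentException(
                    "opcode not handled by this sketch: 0x" + Integer.toHexString(opcode));
        }
    }

    // Walk instruction by instruction so operand bytes are never mistaken for opcodes.
    static List<Integer> iaddOffsets(int[] code) {
        List<Integer> offsets = new ArrayList<>();
        for (int pc = 0; pc < code.length; pc += length(code[pc])) {
            if (code[pc] == IADD) {
                offsets.add(pc);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println("iadd at offsets: " + iaddOffsets(MAIN_F)); // prints [16]
    }
}
```

Stepping by instruction length matters: a naive byte search for 0x60 could match an operand byte in a larger method.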
This will be my x86_64 innocuous noise:
    PUSHFQ
    PUSH RAX
    PUSH RDX
    tryAgain:
    RDTSC
    CMP EAX, EDX
    JE tryAgain
    POP RDX
    POP RAX
    POPFQ
I’m saving the RFLAGS, RAX and RDX registers first because the noise destroys their values: RDTSC overwrites RAX and RDX, and CMP overwrites the flags. Restoring the original values is mandatory to keep the program state intact after the obfuscation code. Then I read the timestamp counter with RDTSC and compare its lower half (EAX) with its higher half (EDX). They will almost always be different. In the unlikely case that they are equal, we try again. Finally we restore the original values and continue execution as if nothing happened.
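The save, loop, restore pattern can be modelled in Java to check that it leaves the surrounding state untouched. This is only a model under stated assumptions: the fields `rax`, `rdx` and `rflags` are plain variables standing in for registers, the flags are reduced to a single "equal" bit, and the counter is an injectable `LongSupplier` (for example `System::nanoTime`) rather than a real RDTSC.

```java
import java.util.function.LongSupplier;

// Hypothetical model of the noise sequence; not real register access.
class RdtscNoiseModel {
    long rax;
    long rdx;
    long rflags;

    // Runs the noise and returns how many counter reads were needed.
    int runNoise(LongSupplier counter) {
        long savedFlags = rflags;               // PUSHFQ
        long savedRax = rax;                    // PUSH RAX
        long savedRdx = rdx;                    // PUSH RDX
        int reads = 0;
        do {
            long tsc = counter.getAsLong();     // RDTSC
            rax = tsc & 0xFFFFFFFFL;            //   low half  -> EAX
            rdx = tsc >>> 32;                   //   high half -> EDX
            rflags = (rax == rdx) ? 1L : 0L;    // CMP EAX, EDX (only ZF is modelled)
            reads++;
        } while (rflags == 1L);                 // JE tryAgain
        rdx = savedRdx;                         // POP RDX
        rax = savedRax;                         // POP RAX
        rflags = savedFlags;                    // POPFQ
        return reads;
    }

    public static void main(String[] args) {
        RdtscNoiseModel m = new RdtscNoiseModel();
        m.rax = 1L;
        m.rdx = 2L;
        m.rflags = 0x246L;
        int reads = m.runNoise(System::nanoTime);
        System.out.println("counter reads: " + reads);
        System.out.println("state intact: "
                + (m.rax == 1L && m.rdx == 2L && m.rflags == 0x246L));
    }
}
```

Injecting the counter makes the retry path easy to exercise: feed it one value whose halves match, followed by one whose halves differ, and the loop should read the counter twice.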
On x86_64, this is the JIT-compiled Main::f method with the noise in place:
    0x7f03f570c0c5: cmp $0x0,%esi
    0x7f03f570c0c8: je 0x7f03f570c281
    0x7f03f570c0ce: mov $0x64,%edx
    0x7f03f570c0d3: jmpq 0x7f03f570c12f
    0x7f03f570c0d8: movabs $0x6d9a02658,%rdx
    0x7f03f570c0e2: mov 0x74(%rdx),%esi
    0x7f03f570c0e5: pushfq
    0x7f03f570c0e6: push %rax
    0x7f03f570c0e7: push %rdx
    0x7f03f570c0e8: rdtsc
    0x7f03f570c0ea: cmp %edx,%eax
    0x7f03f570c0ec: je 0x7f03f570c0e8
    0x7f03f570c0ee: pop %rdx
    0x7f03f570c0ef: pop %rax
    0x7f03f570c0f0: popfq
    0x7f03f570c0f1: add $0x4d2,%esi
    0x7f03f570c0f7: mov %esi,0x74(%rdx)
    0x7f03f570c0fa: lock addl $0x0,-0x40(%rsp)
    0x7f03f570c100: movabs $0x7f03d5cfcc60,%rdx
    0x7f03f570c10a: mov 0x24(%rdx),%esi
    0x7f03f570c10d: add $0x8,%esi
    0x7f03f570c110: mov %esi,0x24(%rdx)
I then decided to add some unaligned jumps as an anti-disassembly technique. Main::f now looks a bit more obscure:
    0x7f4b64c52865: push %rax
    0x7f4b64c52866: lea 0x5(%rip),%rax # 0x7f4b64c52872
    0x7f4b64c5286d: jmpq *%rax
    0x7f4b64c5286f: callq 0x7f4b66b0d32f
    0x7f4b64c52874: callq 0x7f4b95d47b15
    0x7f4b64c52879: jmp 0x7f4b64c5287c
    0x7f4b64c5287b: callq 0x7f4b5c39eabb
    0x7f4b64c52880: jmp 0x7f4b64c52883
    0x7f4b64c52882: callq 0x7f4ae61dc5e1
    0x7f4b64c52887: (bad)
    0x7f4b64c52888: rolb %cl,(%rax,%rax,1)
    0x7f4b64c5288b: add %cl,-0x7c0f8b8e(%rcx)
    0x7f4b64c52891: rex.R and $0xc0,%al
    0x7f4b64c52894: add %cl,-0x46(%rax)
    0x7f4b64c52897: (bad)
    0x7f4b64c52898: fmul %st(7),%st
    0x7f4b64c5289a: xor 0x7f(%rbx),%ecx
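In the listing above, the `jmp` at 0x7f4b64c52879 targets 0x7f4b64c5287c, one byte past the address where the disassembler decodes a `callq`. That suggests a junk byte is being jumped over, which desynchronizes a linear-sweep disassembler. The toy Java model below illustrates the effect with a tiny decoder that knows only five x86-64 opcodes. `ToyDisassembler` and its methods are hypothetical, and the byte sequence is made up for the demonstration, not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy decoder for five x86-64 opcodes; everything else prints as "(bad)".
class ToyDisassembler {
    static String decode(int opcode) {
        switch (opcode) {
            case 0x50: return "push %rax";
            case 0x58: return "pop %rax";
            case 0x90: return "nop";
            case 0xEB: return "jmp";    // jmp rel8
            case 0xE8: return "callq";  // call rel32
            default:   return "(bad)";
        }
    }

    static int length(int opcode) {
        switch (opcode) {
            case 0xEB: return 2;  // opcode + 8-bit displacement
            case 0xE8: return 5;  // opcode + 32-bit displacement
            default:   return 1;
        }
    }

    // Linear sweep: decode instructions back to back, ignoring control flow.
    static List<String> linearSweep(int[] code) {
        List<String> out = new ArrayList<>();
        for (int pc = 0; pc < code.length; pc += length(code[pc])) {
            out.add(decode(code[pc]));
        }
        return out;
    }

    // Follow execution: take every jmp rel8 to its target.
    // Sketch only: assumes forward jumps that stay inside the buffer.
    static List<String> followFlow(int[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc >= 0 && pc < code.length) {
            int opcode = code[pc];
            out.add(decode(opcode));
            int next = pc + length(opcode);
            if (opcode == 0xEB) {
                next += (byte) code[pc + 1];  // signed 8-bit displacement
            }
            pc = next;
        }
        return out;
    }

    public static void main(String[] args) {
        // jmp +1 ; junk byte 0xE8 ; push %rax ; pop %rax ; nop ; nop ; nop
        int[] code = {0xEB, 0x01, 0xE8, 0x50, 0x58, 0x90, 0x90, 0x90};
        System.out.println("linear sweep: " + linearSweep(code));
        System.out.println("follow flow:  " + followFlow(code));
    }
}
```

The linear sweep reports `jmp`, `callq`, `nop`: the junk 0xE8 swallows the `push`/`pop` pair as its operand. Following the jump instead recovers `jmp`, `push %rax`, `pop %rax` and three `nop`s, which is what the CPU would execute.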
Download patch here (based on JDK rev 89111a0e6355).