Quarkslab folks proposed a challenge: modify any compiler to generate code with some obfuscation but valid semantics. In other words, add some noise but keep the program logic valid.
Time to have some fun with C1, OpenJDK’s first-tier JIT compiler.
Obfuscation “noise” can be added at different stages of the compilation process. Adding it early may be useful to target multiple architectures, but you then have to check that later optimization passes do not undo the effect.
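To make that last point concrete, here is a minimal sketch in Java of how a later pass can undo noise that was added too early. It is a toy model, not C1 code: the string-based IR, the class `EarlyNoiseDemo` and both method names are made up for illustration, and the "noise" is a deliberately naive `add 0`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (hypothetical, unrelated to HotSpot's real IR classes).
final class EarlyNoiseDemo {

    // "Early" obfuscation: put a useless "add 0" before every addition.
    static List<String> addNoise(List<String> ir) {
        List<String> out = new ArrayList<>();
        for (String op : ir) {
            if (op.startsWith("add ")) {
                out.add("add 0");
            }
            out.add(op);
        }
        return out;
    }

    // A later peephole-style pass that drops additions of zero.
    static List<String> peephole(List<String> ir) {
        List<String> out = new ArrayList<>(ir);
        out.removeIf(op -> op.equals("add 0"));
        return out;
    }

    public static void main(String[] args) {
        List<String> ir = Arrays.asList("load a", "add 1234", "store a");
        List<String> noisy = addNoise(ir);
        System.out.println(noisy);           // [load a, add 0, add 1234, store a]
        System.out.println(peephole(noisy)); // [load a, add 1234, store a]
    }
}
```

After the second pass the noise is gone, which is exactly the situation to watch out for when the obfuscation is inserted before the optimizer runs.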
The basic flow for C1 is the following: Java bytecodes -> IR (Intermediate Representation) -> LIR (Low-level Intermediate Representation) -> assembly code (architecture specific). A few key functions involved in these transformations: GraphBuilder::iterate_bytecodes_for_block (for IR generation), LIRGenerator::block_do (for LIR generation) and LIR_Assembler::emit_lir_list (for assembly generation).
This is my basic pseudo-random Java test code (don’t even try to make sense of it!):
public class Main {

    private static volatile int a;
    private static String s;

    public static void main(String[] args) {
        System.out.println("Jit example started");
        int max_iterations = 15000;
        while(max_iterations-- > 0) {
            f();
        }
        System.out.println("a (final): " + a);
        System.out.println("s (final): " + s);
    }

    private static void f() {
        int i = 100;
        while (i-- > 0) {
            a = a + 1234;
        }
        s = new Integer(a).toString();
    }
}
The method Main::f is called 15,000 times in a loop, so it becomes hot and C1 compiles it first.
This is the bytecode of Main::f:
private static void f();
 Code:
  0: bipush 100
  2: istore_0
  3: iload_0
  4: iinc 0, -1
  7: ifle 23
 10: getstatic #6 // Field a:I
 13: sipush 1234
 16: iadd
 17: putstatic #6 // Field a:I
 20: goto 3
 23: new #10 // class java/lang/Integer
 26: dup
 27: getstatic #6 // Field a:I
 30: invokespecial #11 // Method java/lang/Integer."<init>":(I)V
 33: invokevirtual #12 // Method java/lang/Integer.toString:()Ljava/lang/String;
 36: putstatic #8 // Field s:Ljava/lang/String;
 39: return
My obfuscation noise will be inserted before every addition operation. In this case, that means before the iadd instruction.
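As a rough illustration of that placement rule, the sketch below walks a list of operations and emits a noise marker in front of each addition. It is only a toy in Java: the class `NoiseBeforeAdd`, the `<noise>` marker and the string-based instruction list are assumptions made for this example, and it says nothing about where or how the actual C1 patch hooks in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy instrumentation pass (hypothetical; not part of the real patch).
final class NoiseBeforeAdd {

    static final String NOISE = "<noise>";

    // Copy the instruction stream, emitting the noise marker before each iadd.
    static List<String> instrument(List<String> ops) {
        List<String> out = new ArrayList<>();
        for (String op : ops) {
            if (op.equals("iadd")) {
                out.add(NOISE);
            }
            out.add(op);
        }
        return out;
    }

    public static void main(String[] args) {
        // Loop body of Main::f, taken from the bytecode listing above.
        List<String> body = Arrays.asList("getstatic", "sipush", "iadd", "putstatic");
        System.out.println(instrument(body));
        // [getstatic, sipush, <noise>, iadd, putstatic]
    }
}
```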
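This is my innocuous x86_64 noise:

          PUSHFQ              ; save RFLAGS (CMP below modifies them)
          PUSH RAX            ; RDTSC overwrites RAX...
          PUSH RDX            ; ...and RDX
tryAgain:
          RDTSC               ; timestamp counter -> EDX:EAX
          CMP EAX, EDX        ; compare the low and high halves
          JE tryAgain         ; retry in the unlikely case they are equal
          POP RDX
          POP RAX
          POPFQ               ; restore RFLAGS
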
I save the flags register (RFLAGS in 64-bit mode), RAX and RDX first because the noise clobbers them: RDTSC writes the timestamp counter into EDX:EAX and CMP updates the flags. Restoring the original values afterwards is mandatory to keep the program state intact. In between, I read the timestamp with RDTSC and compare its lower and higher halves. They will almost always differ. In the unlikely case that they are equal, we try again. Then we restore the original values and continue execution as if nothing happened.
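The retry loop can be modeled in plain Java to see why it almost never iterates more than once. This is a sketch under assumptions: `RdtscNoiseModel` and its methods are illustrative names, and `System.nanoTime()` merely stands in for the timestamp counter, which Java cannot read directly.

```java
import java.util.function.LongSupplier;

// Java model of the RDTSC / CMP / JE loop (illustrative only).
final class RdtscNoiseModel {

    // RDTSC returns the low 32 bits in EAX and the high 32 bits in EDX.
    static boolean halvesEqual(long counter) {
        int eax = (int) counter;
        int edx = (int) (counter >>> 32);
        return eax == edx;
    }

    // Keeps reading the counter until both halves differ;
    // returns how many reads were needed.
    static int spin(LongSupplier counter) {
        int reads = 0;
        long value;
        do {
            value = counter.getAsLong();
            reads++;
        } while (halvesEqual(value));
        return reads;
    }

    public static void main(String[] args) {
        // nanoTime() is only a stand-in for the timestamp counter here.
        System.out.println("reads needed: " + spin(System::nanoTime));
    }
}
```

The loop only repeats when the two 32-bit halves happen to match, so in practice the noise costs a single counter read plus the save and restore of the three registers.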
This is the JIT-compiled Main::f on x86_64, with the noise placed right before the add:
0x7f03f570c0c5:      cmp    $0x0,%esi
0x7f03f570c0c8:      je     0x7f03f570c281
0x7f03f570c0ce:      mov    $0x64,%edx
0x7f03f570c0d3:      jmpq   0x7f03f570c12f
0x7f03f570c0d8:      movabs $0x6d9a02658,%rdx
0x7f03f570c0e2:      mov    0x74(%rdx),%esi
0x7f03f570c0e5:      pushfq
0x7f03f570c0e6:      push   %rax
0x7f03f570c0e7:      push   %rdx
0x7f03f570c0e8:      rdtsc
0x7f03f570c0ea:      cmp    %edx,%eax
0x7f03f570c0ec:      je     0x7f03f570c0e8
0x7f03f570c0ee:      pop    %rdx
0x7f03f570c0ef:      pop    %rax
0x7f03f570c0f0:      popfq
0x7f03f570c0f1:      add    $0x4d2,%esi
0x7f03f570c0f7:      mov    %esi,0x74(%rdx)
0x7f03f570c0fa:      lock addl $0x0,-0x40(%rsp)
0x7f03f570c100:      movabs $0x7f03d5cfcc60,%rdx
0x7f03f570c10a:      mov    0x24(%rdx),%esi
0x7f03f570c10d:      add    $0x8,%esi
0x7f03f570c110:      mov    %esi,0x24(%rdx)

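I then decided to add some unaligned jumps as an anti-disassembly technique: jumps whose targets fall in the middle of what a linear disassembler decodes as a single instruction. Main::f now looks a bit more obscure: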
0x7f4b64c52865:      push   %rax
0x7f4b64c52866:      lea    0x5(%rip),%rax        # 0x7f4b64c52872
0x7f4b64c5286d:      jmpq   *%rax
0x7f4b64c5286f:      callq  0x7f4b66b0d32f
0x7f4b64c52874:      callq  0x7f4b95d47b15
0x7f4b64c52879:      jmp    0x7f4b64c5287c
0x7f4b64c5287b:      callq  0x7f4b5c39eabb
0x7f4b64c52880:      jmp    0x7f4b64c52883
0x7f4b64c52882:      callq  0x7f4ae61dc5e1
0x7f4b64c52887:      (bad)
0x7f4b64c52888:      rolb   %cl,(%rax,%rax,1)
0x7f4b64c5288b:      add    %cl,-0x7c0f8b8e(%rcx)
0x7f4b64c52891:      rex.R and $0xc0,%al
0x7f4b64c52894:      add    %cl,-0x46(%rax)
0x7f4b64c52897:      (bad)
0x7f4b64c52898:      fmul   %st(7),%st
0x7f4b64c5289a:      xor    0x7f(%rbx),%ecx

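In the listing, for instance, jmp 0x7f4b64c5287c skips the single byte at 0x7f4b64c5287b, which the disassembler shows as the start of a 5-byte callq that swallows the real instructions after it. The Java sketch below reproduces that effect on a made-up instruction set. Everything in it is an assumption for illustration (the opcodes, their lengths, the class `JunkByteDemo`); it models the idea, not x86 encoding.

```java
import java.util.ArrayList;
import java.util.List;

// Toy ISA showing how a junk byte after a jump desynchronizes a linear sweep.
final class JunkByteDemo {

    static final int NOP = 0x01, JMP = 0x02, CALL = 0x03, ADD = 0x04;

    // JMP rel8, junk CALL opcode, then the real code: ADD and four NOPs.
    static final byte[] SAMPLE = { JMP, 1, CALL, ADD, NOP, NOP, NOP, NOP };

    static int length(int opcode) {
        switch (opcode) {
            case JMP:  return 2; // opcode + 8-bit relative offset
            case CALL: return 5; // opcode + 32-bit operand
            default:   return 1;
        }
    }

    static String name(int opcode) {
        switch (opcode) {
            case NOP:  return "NOP";
            case JMP:  return "JMP";
            case CALL: return "CALL";
            case ADD:  return "ADD";
            default:   return "BAD";
        }
    }

    // Decodes one instruction after another, ignoring control flow.
    static List<String> linearSweep(byte[] code) {
        List<String> out = new ArrayList<>();
        for (int pc = 0; pc < code.length; pc += length(code[pc])) {
            out.add(name(code[pc]));
        }
        return out;
    }

    // Decodes along the path the CPU would take (forward jumps only here).
    static List<String> followJumps(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int op = code[pc];
            out.add(name(op));
            pc += (op == JMP) ? 2 + code[pc + 1] : length(op);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(linearSweep(SAMPLE)); // [JMP, CALL, NOP]
        System.out.println(followJumps(SAMPLE)); // [JMP, ADD, NOP, NOP, NOP, NOP]
    }
}
```

The linear sweep never shows the ADD because the junk CALL opcode hides it inside a bogus operand, while following the jump recovers the real instruction stream.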
					
Download patch here (based on JDK rev 89111a0e6355).
