I'm very curious about the jump obfuscation. Maybe somebody who's done more reverse-engineering can answer this for me:
a) Are unconditional jumps common enough that they couldn't be filtered out with some set of pre-conditions?
b) It seems like finding the end of a function would be easy, because there's a return. Is there some way to analyze the stack so that you know where a function is returning to, then look for a call immediately preceding the return address?
Apologies if I'm wrong about how this works; I haven't done much x86 assembly programming.
Unconditional jumps are very common, and everything in x86 assembly is very messy after optimization. Many functions do not end in ret (a tail call, for example, ends in a jmp).
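On question (b), the "look for a call immediately preceding the return address" idea can be sketched roughly like this. It is only a sketch: `mem` and `base` are hypothetical stand-ins for a dump of the code and the address it is mapped at, and it only recognizes the 5-byte direct form `E8 rel32`. Indirect calls have other encodings and lengths, and a stray 0xE8 byte can produce a false match.

```python
import struct

def callee_from_return_address(mem, base, ret_addr):
    """Return the target of a direct `call rel32` that ends at ret_addr, else None.

    mem is a bytes snapshot of code mapped at address base (both hypothetical).
    """
    off = ret_addr - base - 5            # E8 opcode + 4-byte displacement = 5 bytes
    if off < 0 or ret_addr - base > len(mem) or mem[off] != 0xE8:
        return None
    (rel,) = struct.unpack_from("<i", mem, off + 1)   # signed little-endian rel32
    return ret_addr + rel                # displacement is relative to the next instruction

# 16 filler bytes, then `call +0x100`, then a nop at the return address.
code = bytes(16) + b"\xE8" + struct.pack("<i", 0x100) + b"\x90"
print(hex(callee_from_return_address(code, 0x1000, 0x1015)))   # 0x1115
```

This tells you which function was called, not where it ends, so it does not help with functions that leave through a jmp instead of a ret.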
This video[1] on reverse-engineering parts of Guitar Hero 3 covers a few similar techniques that were used to heavily obfuscate the game code that you might find interesting.
A few common issues:
1. Some jumps will be fake.
2. Some jumps will land inside an instruction. Decompilers can't handle two instructions at the same location. (Like jmp 0x1234: you skip the jmp opcode and assume 0x1234 is a valid instruction.)
3. The stack will be fucked up in a branch, but this is intentional, to cause an exception. So you can nop an instruction like lea RAX, [rsp + 0x99999999999] to fix decompilation, but then you may miss an intentional exception.
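To illustrate issue 2, here is a toy sketch of why a jump into the middle of an instruction confuses a linear disassembler. The decoder below is made up for the example and knows only five real x86 encodings (jmp rel8, call rel32, xor eax, eax, nop, ret); the byte sequence is a hypothetical one, not taken from any real binary.

```python
# EB 01 = jmp +1, which skips the junk byte E8 (the opcode of call rel32).
CODE = bytes.fromhex("EB01E831C09090C3")

def decode(code, off):
    """Return (mnemonic, size, jump_target) for the few opcodes this toy knows."""
    op = code[off]
    if op == 0xEB:                                   # jmp rel8
        rel = int.from_bytes(code[off + 1:off + 2], "little", signed=True)
        return "jmp", 2, off + 2 + rel
    if op == 0xE8 and off + 5 <= len(code):          # call rel32
        return "call", 5, None
    if code[off:off + 2] == b"\x31\xC0":
        return "xor eax, eax", 2, None
    if op == 0x90:
        return "nop", 1, None
    if op == 0xC3:
        return "ret", 1, None
    raise ValueError(f"unknown or truncated instruction at offset {off}")

def linear_sweep(code):
    """Decode instructions back to back, ignoring control flow."""
    out, off = [], 0
    while off < len(code):
        name, size, _ = decode(code, off)
        out.append((off, name))
        off += size
    return out

def follow_jumps(code, start=0):
    """Decode along the executed path, taking unconditional jumps."""
    out, off, seen = [], start, set()
    while off < len(code) and off not in seen:
        seen.add(off)
        name, size, target = decode(code, off)
        out.append((off, name))
        if name == "ret":
            break
        off = target if name == "jmp" else off + size
    return out

print(linear_sweep(CODE))   # the junk E8 swallows the real xor and both nops
print(follow_jumps(CODE))   # the path the CPU actually executes
```

The linear sweep reports a call at offset 2 that never executes, while following the jump recovers the xor at offset 3, which sits inside that bogus call.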
IDA doesn't handle stuff like this well, so I have a Binary Ninja license; you can easily write a script that inlines functions for their decompiler. IDA can't really handle it since a thunk (chunk of code between jmps) can only belong to one function, and the jmps will reuse chunks of code between each other. I think most people don't use it since there was a bug with Binary Ninja in Blizzard games, but they fixed it after a bug report a year or so ago.
Yeah, it should be easy enough to filter these particular jumps out. It's an obfuscation designed to annoy people using common off-the-shelf tools (especially IDA Pro).
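One way such a filter might look, as a rough sketch: assume you have already collected a map from the address of every block that contains nothing but an unconditional jmp to that jmp's target (the `jumps` dict below is hypothetical data, and building it is the tool-specific part). Collapsing the chains is then a short loop.

```python
def resolve_jump_chain(jumps, target):
    """Follow jmp -> jmp -> ... until reaching an address that is not a pure jump block."""
    seen = set()
    while target in jumps and target not in seen:   # the seen set guards against jump cycles
        seen.add(target)
        target = jumps[target]
    return target

# Hypothetical trampolines: 0x10 jumps to 0x20, which jumps to the real code at 0x30.
jumps = {0x10: 0x20, 0x20: 0x30}
print(hex(resolve_jump_chain(jumps, 0x10)))   # 0x30
```

Rewriting every branch to point at the resolved address leaves the trampoline blocks unreferenced, so they can be dropped from the graph.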
Most obfuscations are only trying to annoy people just enough that they move on to other projects.
There are some other cool tricks you can do, where you symbolically execute using angr or another emulator such as https://github.com/cea-sec/miasm to do control flow graph unflattening. You can also use Intel's Pin framework to do some interesting analysis. Some helpful articles here:
- https://calwa.re/reversing/obfuscation/binary-deobfuscation-...
- https://www.nccgroup.com/us/research-blog/a-look-at-some-rea...