Extracting VMProtect handlers with Binary Ninja
Automatically converting Binary Ninja Low Level IL (LLIL) into python
I've started looking into the Adylkuzz malware, as mentioned by Tim Blazytko in his article on Automated Detection of Obfuscated Code. Initial analysis shows a TLS entry handler that dumps us straight into a VMProtect VMEnter() function that looks like this in the HLIL:
005becad int32_t var_4 = arg4
005becb0 _bswap(not.d(arg4))
005becb5 int32_t var_8 = arg2
005becb6 void* const var_c = 0xea6bdba7
005becb7 int32_t ebx // junk
005becb7 bool s
005becb7 bool o
005becb7 ebx.b = s != o
005becbc int32_t eax
005becbc int32_t var_10 = eax
005becbd eax:1.b = 0xcf // junk
005becc2 int32_t var_14 = arg1
005becc6 int32_t edi
005becc6 int32_t var_18 = edi
005becc7 int32_t var_1c = arg3
005becc8 int32_t ebx_1 // junk
005becc8 ebx_1.w = 0x7f28
005beccf bool c
005beccf bool p
005beccf bool a
005beccf bool z
005beccf bool d
005beccf int32_t var_20 = (o ? 1 : 0) << 0xb | (d ? 1 : 0) << 0xa | (s ? 1 : 0) << 7 | (z ? 1 : 0) << 6 | (a ? 1 : 0) << 4 | (p ? 1 : 0) << 2 | (c ? 1 : 0) << 0
005becd5 edi.w = 0x5e8d // junk
005becd9 int32_t var_24 = 0
005bece3 arg3.w = arg3.w & not.w(1 << modu.w(arg2.w, 0x10)) // junk
005bece8 int32_t esi_4 = neg.d(arg5 + 1)
005becea int32_t eflags // junk
005becea uint16_t temp0
005becea temp0, eflags = _bit_scan_reverse(esi_4.w)
⋯005becf9 bool c_1 = unimplemented {ror esi, 0x1}
005bed02 edi.w = rlc.w(edi.w, 0x2a, c_1)
005bed0e int32_t eax_1
005bed0e eax_1.w = 0x253c
005bed3c int16_t eax_2
005bed3c eax_2:1.b = ror.w(0 ^ (ror.d(esi_4 ^ 0x27e9128c, 1) + 1).w, 0x72):1.b << arg1.b
005bed54 int32_t eax_8 = rol.d(not.d((*(ror.d(esi_4 ^ 0x27e9128c, 1) - 3) ^ (ror.d(esi_4 ^ 0x27e9128c, 1) + 1)) - 0x5ca20a41) - 0x40d54c06, 2)
005bed61 int32_t var_e8 = 0x5bed29 + eax_8
005bed62 return eax_8
It's a little bit hard to follow, partially because VMProtect is well known for using a lot of junk instructions. If we clean up the ASM it looks like this:
push esi
push edx
push ebx
push eax
push ecx
push edi
push ebp
pushfd
mov eax, 0x0 // this gets relocated
push eax
mov esi, dword [esp+0x28] // encrypted VIP
inc esi
neg esi
xor esi, 0x27e9128c
ror esi, 0x1
inc esi
lea esi, [esi+eax]
mov ebp, esp
lea esp, [esp-0xc0]
mov ebx, esi
mov eax, 0x0 // this gets relocated
sub ebx, eax
lea edi, [0x5bed29]
lea esi, [esi-0x4]
mov eax, dword [esi]
xor eax, ebx
lea eax, [eax-0x5ca20a41]
not eax
sub eax, 0x40d54c06
rol eax, 0x2
xor ebx, eax
add edi, eax
push edi
retn // obfuscated jump to first VM handler
This is a little yuck for us, even once we've removed the junk instructions by hand. Effectively, it pushes all the registers and flags, decrypts the VIP that is passed as the only argument on the stack, initialises the stream cipher with this, then decrypts the first instruction handler pointer and jumps to it. If we look back at the HLIL, we can see it actually sums this up fairly well for us. If we change to SSA view we can make sure nothing we care about is getting clobbered:
005bed54 int32_t eax_8#1 = rol.d(not.d((*(ror.d(esi_4#1 ^ 0x27e9128c, 1) - 3) @ mem#2 ^ (ror.d(esi_4#1 ^ 0x27e9128c, 1) + 1)) - 0x5ca20a41) - 0x40d54c06, 2)
005bed61 int32_t var_e8#1 = 0x5bed29 + eax_8#1
005bed62 return eax_8#1
The value of var_e8#1 is what we're most interested in: in practice we aren't returning anything, we are jumping to that location. Because we've put this in SSA view, we can be confident when we look back up the view that this is only based on esi_4#1, and this is defined earlier:
int32_t esi_4#1 = neg.d(arg5#0 + 1) // arg5 is actually the only arg pushed on the stack
So if we want to find the address of the first handler, we just need to get the original encrypted VIP passed on the stack (0xd8cb8f6d in this case), and we can evaluate this equation ourselves to get the first handler at 0x5babbd. Looking at the first handler in the HLIL though, we can see we need to look a little deeper to get the information we want:
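As a rough sketch, the VMEnter arithmetic can be transcribed into plain Python. The constants come straight from the cleaned-up assembly above; the function names, the `reloc` parameter (standing in for the relocated `mov eax, 0x0`) and the `encrypted_dword` argument (the DWORD stored just below the decrypted VIP) are placeholders of my own, not anything Binary Ninja provides.

```python
MASK32 = 0xFFFFFFFF

def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & MASK32

def rol32(value, count):
    """Rotate a 32-bit value left by `count` bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32

def decrypt_vip(encrypted_vip, reloc=0):
    """Mirror the inc / neg / xor / ror / inc sequence applied to esi."""
    esi = (encrypted_vip + 1) & MASK32   # inc esi
    esi = (-esi) & MASK32                # neg esi
    esi ^= 0x27E9128C                    # xor esi, 0x27e9128c
    esi = ror32(esi, 1)                  # ror esi, 0x1
    esi = (esi + 1) & MASK32             # inc esi
    return (esi + reloc) & MASK32        # lea esi, [esi+eax]

def first_handler(encrypted_dword, vip, reloc=0, handler_base=0x5BED29):
    """Decrypt the DWORD read from vip - 4 into the first handler address."""
    key = (vip - reloc) & MASK32         # mov ebx, esi / sub ebx, eax
    eax = encrypted_dword ^ key          # xor eax, ebx
    eax = (eax - 0x5CA20A41) & MASK32    # lea eax, [eax-0x5ca20a41]
    eax = ~eax & MASK32                  # not eax
    eax = (eax - 0x40D54C06) & MASK32    # sub eax, 0x40d54c06
    eax = rol32(eax, 2)                  # rol eax, 0x2
    return (handler_base + eax) & MASK32 # add edi, eax
```

To use it you would call `decrypt_vip(0xd8cb8f6d)`, read four little-endian bytes at the resulting address minus 4 (for example with `bv.read()` in the console), and pass that value to `first_handler()`.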
005babc7 arg1#1.b = arg1#0.b & nullptr
005babcc uint32_t eax
005babcc eax#1.b = *(arg3#0 - 1) @ mem#0 ^ 0xa7
005babde eax#2.b = not.b(eax#1.b)
005babe0 eax#3.b = eax#2.b - 0xa3
005babe2 eax#4.b = rol.b(eax#3.b, 1)
005babea eax#5.b = eax#4.b + 1
005babee eax#6.b = ror.b(eax#5.b, 1)
005babf0 int32_t ebx
005babf0 ebx#1.b = 0xa7 ^ eax#6.b
005babfd *(&__return_addr + eax#6) @ mem#0 @ mem#1 = *arg2#0 @ mem#0 @ mem#0
⋯005bac3d jump(arg4#0 + rol.d(not.d((*(arg3#0 - 5) @ mem#1 ^ ebx#1) - 0x5ca20a41) - 0x40d54c06, 2))
Where is ebx#0? What is *(&__return_addr + eax#6)? From what I can see, a big part of the problem is that Binary Ninja is assuming that the functions we reverse are playing along nicely with the x86 standard, and for example, that esp is the stack pointer and that it holds a reference to a return address. I had a play with trying to extract something useful out of the HLIL and MLIL, but I had a hard time with the following:
- If you follow the AST back to the input arguments, there's no direct way to see if they're backed by variables. You can do this by checking the function type information but it's a little clunky
- We really want to track the key registers that drive the VMProtect virtual machine, and the MLIL and HLIL views are a little bit abstracted from the assembly, so it's not the best tool for the job
That leaves us with the LLIL, and the short answer is, this gives us the sweet spot where we're very close to the original assembly, but also have some of the heavy lifting (SSA and lifting into IL) done for us.
VMProtect 3 has been described elsewhere (here and here among others), and the basic idea is this:
- esi is the virtual instruction pointer, VIP
- edi is the offset of the current VM handler (opcodes are offsets from the previous handler so we need to track this)
- esp is the offset to the scratch registers
- ebp is the stack pointer for the VM
- ebx is the stream cipher that is used to decrypt the stream of opcodes
If we can see what these registers resolve to at the end of the handler then we can find the address of the next handler, and profile the current one to automatically identify it. Firstly though, let's start getting into how the interface works.
Accessing LLIL instructions from the Python Console
Click on an instruction to highlight it, and get the function that holds this (the functions we need hang off the binaryninja.function.Function class):
>>> func = bv.get_functions_containing(here)[0]
>>> func
<func: x86@0x5babbd>
It's possible that bv.get_functions_containing() could return multiple functions (or none, if we're outside a function), but let's live dangerously here and assume there's only going to be one function returned. From here, we want to get the LLIL in SSA form:
>>> llil_ssa = func.llil.ssa_form
>>> llil_ssa
<llil func: x86@0x5babbd>
>>> llil_ssa.registers
[<reg ecx>, <reg esp>, <reg ebp>, <reg edx>, <reg eax>, <reg esi>, <reg ebx>, <reg temp1>, <reg edi>, <reg temp0>]
>>> llil_ssa.ssa_registers
[<ssa ecx version 0>, <ssa ecx version 1>, <ssa ecx version 2>, <ssa ecx version 3>, <ssa ecx version 4>, <ssa ecx version 5>, <ssa ecx version 6>, <ssa ecx version 7>, <ssa esp version 0>, <ssa ebp version 0>, <ssa ebp version 1>, <ssa edx version 0>, <ssa eax version 1>, <ssa eax version 2>, <ssa eax version 3>, <ssa eax version 4>, <ssa eax version 5>, <ssa eax version 6>, <ssa eax version 7>, <ssa eax version 8>, <ssa eax version 9>, <ssa eax version 10>, <ssa eax version 11>, <ssa eax version 12>, <ssa eax version 13>, <ssa eax version 14>, <ssa eax version 15>, <ssa eax version 16>, <ssa esi version 0>, <ssa esi version 1>, <ssa esi version 2>, <ssa ebx version 0>, <ssa ebx version 1>, <ssa ebx version 2>, <ssa temp1 version 1>, <ssa edi version 0>, <ssa edi version 1>, <ssa temp0 version 1>]
>>> llil_ssa[0]
<llil: esi#1 = esi#0 - 1>
>>> llil_ssa[1]
<llil: eax#1 = zx.d([esi#1].b @ mem#0)>
From here we can access both the base registers and their SSA forms. At a glance, we can see eax gets heavily used, whereas edi and esi don't see a huge amount of action. We can also subscript the llil.ssa_form object, which returns instructions for each line in the view.
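Going back to the bv.get_functions_containing() call for a moment: if living dangerously isn't your thing, a small wrapper can make the "exactly one function" assumption explicit. This is only a sketch; `single_function_containing` is a name I've made up, and it relies on nothing beyond the `get_functions_containing()` call already used above.

```python
def single_function_containing(view, addr):
    """Return the one function containing `addr`, or raise if there isn't exactly one."""
    funcs = list(view.get_functions_containing(addr))
    if not funcs:
        raise LookupError("no function contains %#x" % addr)
    if len(funcs) > 1:
        raise LookupError("%d functions contain %#x" % (len(funcs), addr))
    return funcs[0]
```

In the console this would be called as `func = single_function_containing(bv, here)`.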
At this point it's going to be quite useful to look at the LLIL docs. There are two main types of instructions we'll care about here:
>>> llil_ssa[1]
<llil: eax#1 = zx.d([esi#1].b @ mem#0)>
>>> type(llil_ssa[1])
<class 'binaryninja.lowlevelil.LowLevelILSetRegSsa'>
>>> llil_ssa[5]
<llil: eax#2.al = eax#1.al ^ ebx#0.bl>
>>> type(llil_ssa[5])
<class 'binaryninja.lowlevelil.LowLevelILSetRegSsaPartial'>
As you can probably guess from the name, the LowLevelILSetRegSsa class represents a register being set to a new value, whereas the LowLevelILSetRegSsaPartial class represents part of the register being set, e.g. bl, which is the low byte of ebx, or si, which is the low word of esi. As far as I could tell, all the instructions subclass LowLevelILInstruction directly, rather than subclassing something more specific like a LowLevelILAssignment class, so we need to handle these directly. It's important to note that various instructions that modify flags but otherwise don't do anything get represented here too, and you often find these in the junk code, for example:
>>> llil_ssa[7]
<llil: esi#1 & 0x4de74ba7>
>>> type(llil_ssa[7])
<class 'binaryninja.lowlevelil.LowLevelILAnd'>
>>> bv.get_disassembly(llil_ssa[7].address)
'test esi, 0x4de74ba7'
The good news is we shouldn't have to worry about this as we'll just be tracking register definitions (although this will be a pain later when we get to managing flags). If we look at these objects, we have a few parameters that we care about:
>>> llil_ssa[1]
<llil: eax#1 = zx.d([esi#1].b @ mem#0)>
>>> llil_ssa[1].dest
<ssa eax version 1>
>>> type(llil_ssa[1].dest)
<class 'binaryninja.lowlevelil.SSARegister'>
>>> llil_ssa[1].src
<llil: zx.d([esi#1].b @ mem#0)>
>>> type(llil_ssa[1].src)
<class 'binaryninja.lowlevelil.LowLevelILZx'>
For a LowLevelILSetRegSsa object we can use the dest property to get the SSARegister that is being written to. In the src property we will see the tree of LowLevelILInstruction objects that will end with either registers or constants. For nearly all of these, we can use the operands property to access the nodes further down the tree:
>>> llil_ssa[40]
<llil: eax#15 = eax#14 - 0x40d54c06>
>>> type(llil_ssa[40])
<class 'binaryninja.lowlevelil.LowLevelILSetRegSsa'>
>>> llil_ssa[40].src.operands
[<llil: eax#14>, <llil: 0x40d54c06>]
>>> [type(x) for x in llil_ssa[40].src.operands]
[<class 'binaryninja.lowlevelil.LowLevelILRegSsa'>, <class 'binaryninja.lowlevelil.LowLevelILConst'>]
>>> llil_ssa[19]
<llil: ebx#1.bl = ebx#0.bl ^ eax#7.al>
>>> type(llil_ssa[19])
<class 'binaryninja.lowlevelil.LowLevelILSetRegSsaPartial'>
>>> llil_ssa[19].src.operands
[<llil: ebx#0.bl>, <llil: eax#7.al>]
>>> [type(x) for x in llil_ssa[19].src.operands]
[<class 'binaryninja.lowlevelil.LowLevelILRegSsaPartial'>, <class 'binaryninja.lowlevelil.LowLevelILRegSsaPartial'>]
Because this has been lifted directly from the assembly, we should generally see only one layer of instructions, unless it's a complex instruction like a movzx, which both accesses a memory location and zero-extends it:
>>> llil_ssa[1]
<llil: eax#1 = zx.d([esi#1].b @ mem#0)>
>>> llil_ssa[1].src
<llil: zx.d([esi#1].b @ mem#0)>
>>> llil_ssa[1].src.src
<llil: [esi#1].b @ mem#0>
>>> llil_ssa[1].src.src.src
<llil: esi#1>
>>> bv.get_disassembly(llil_ssa[1].address)
'movzx eax, byte [esi]'
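Rather than chaining .src.src.src by hand, the same walk can be done generically through the operands property. The sketch below only assumes that expression nodes expose an `operands` list and that anything without one is a leaf; `dump_tree` is my own helper name. On real LLIL objects the leaves will include things like SSARegister objects and the integer memory version, which simply get printed as-is.

```python
def dump_tree(node, depth=0, lines=None):
    """Collect one line per node; indentation shows depth, text shows the type name."""
    if lines is None:
        lines = []
    lines.append("%s%s: %s" % ("  " * depth, type(node).__name__, node))
    # Anything without an `operands` attribute (ints, registers) is treated as a leaf
    for operand in getattr(node, "operands", []):
        dump_tree(operand, depth + 1, lines)
    return lines
```

In the console, `print("\n".join(dump_tree(llil_ssa[1].src)))` would show the zero-extend, the load, and the register underneath it.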
We have all the pieces we need to build our extractor now. We could resolve the SSA registers directly and build complete ASTs, but I've chosen to just resolve each instruction one at a time and output something close to Python (you'll need to implement the zx() and read_mem() functions yourself).
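For what it's worth, here is one way those helpers could look. It's a minimal sketch under a few assumptions of mine: memory is a flat little-endian bytes buffer loaded at a made-up `MEMORY_BASE`, sizes are given in bytes (as in `zx(..., 4)`), and the bitwise NOT that the generated output writes as `not(value, size)` is spelled `not_` here because `not` is a reserved word in Python.

```python
MEMORY_BASE = 0x1000                      # hypothetical load address for the demo buffer
MEMORY = bytes([0x11, 0x22, 0x33, 0x44, 0x80])

def read_mem(address, size):
    """Read `size` bytes at `address` as a little-endian unsigned integer."""
    offset = address - MEMORY_BASE
    if offset < 0 or offset + size > len(MEMORY):
        raise ValueError("read of %d bytes at %#x is outside the buffer" % (size, address))
    return int.from_bytes(MEMORY[offset:offset + size], "little")

def zx(value, size):
    """Zero-extend: clamp the value to `size` bytes, leaving the upper bits clear."""
    return value & ((1 << (8 * size)) - 1)

def not_(value, size):
    """Bitwise NOT truncated to `size` bytes."""
    return ~value & ((1 << (8 * size)) - 1)
```

With real data you would swap the demo buffer for bytes pulled out of the binary.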
Building the extractor
We'll start with a simple function to get going:
def resolve_dest(dest):
    if type(dest) == SSARegister:
        return "%s_%s" % (dest.reg, dest.version)
    else:
        raise Exception("Couldn't resolve destination %s type %s" % (dest, type(dest)))
We can copy-paste this directly into the python console and call it whenever we like. Let's try this - select an instruction (I'm going to do line 1 in the LLIL in SSA form), and execute this:
>>> bv.get_functions_containing(here)[0].get_llil_at(here)
<llil: eax = zx.d([esi].b)>
>>> bv.get_functions_containing(here)[0].get_llil_at(here).ssa_form
<llil: eax#1 = zx.d([esi#1].b @ mem#0)>
>>> bv.get_functions_containing(here)[0].get_llil_at(here).ssa_form.dest
<ssa eax version 1>
>>> resolve_dest(bv.get_functions_containing(here)[0].get_llil_at(here).ssa_form.dest)
'eax_1'
That was pretty easy. Let's try resolving the sources. I've decided to do this through a loop rather than through recursion, partially because I got confused debugging this when I tried it the recursive way, and partially because I haven't done it this way for a while and needed the practice. We do a depth-first traversal of the tree, ordering objects in our todo array, and adding any non-leaf nodes back onto the sources array to make sure we traverse them too:
sources = [source]
todo = []
output = []
while sources:
    source = sources.pop()
    if type(source) in [LowLevelILSub, LowLevelILZx, LowLevelILSx, LowLevelILAnd, LowLevelILXor, LowLevelILOr, LowLevelILNot, LowLevelILLsl, LowLevelILLsr, LowLevelILRol, LowLevelILRor, LowLevelILAdd]:
        todo.append(source)
        for operand in source.operands:
            sources.append(operand)
    elif type(source) in [LowLevelILLoadSsa]:
        # operands are [src, src_memory] and src_memory is just an int ref we don't want
        todo.append(source)
        sources.append(source.src)
    elif type(source) in [LowLevelILConst, LowLevelILRegSsa, LowLevelILRegSsaPartial]:
        todo.append(source)
    else:
        raise Exception("Couldn't process instruction %s type %s" % (source, type(source)))
Now we can process the outputs. Some of the assignments will be directly setting a value, so we can handle these first:
if type(value) == LowLevelILConst:
    output.append(hex(value.constant))
elif type(value) == LowLevelILRegSsa:
    output.append("%s_%s" % (value.src.reg.name, value.src.version))
elif type(value) == LowLevelILRegSsaPartial:
    result = "(%s & %s_%s)" % (hex(masks[value.src.name]), value.full_reg.reg.name, value.full_reg.version)
    if value.src.name in shifts:
        result = "(%s %s)" % (result, shifts[value.src.name])
    output.append(result)
Feel free to ignore the LowLevelILRegSsaPartial implementation here, or skip forward to the source to see how this all hooks up. This is always going to be an implementation decision, and a framework like Triton has complex objects for registers that manage the smaller parts, but I've chosen here just to mask things, which complicates the output but makes it easy to follow. We could easily decide here to resolve the registers and insert them in place if we wanted to build an AST; this is an exercise for the reader.
Note: python doesn't have unsigned integers, and things will behave weirdly when we have negative numbers interacting with bitwise arithmetic. I haven't implemented this very carefully and there will be bugs with this.
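To make that concrete, here is a small sketch of the usual workaround: mask after every operation that can overflow or go negative. The helper names (`to_u32`, `rol`, `ror`) are mine, and this is just one common way of doing it, not what the plugin does.

```python
MASK32 = 0xFFFFFFFF

def to_u32(value):
    """Fold an arbitrary Python int back into the unsigned 32-bit range."""
    return value & MASK32

def rol(value, count, bits=32):
    """Rotate left within a `bits`-wide field."""
    mask = (1 << bits) - 1
    count %= bits
    value &= mask
    return ((value << count) | (value >> (bits - count))) & mask

def ror(value, count, bits=32):
    """Rotate right within a `bits`-wide field."""
    mask = (1 << bits) - 1
    count %= bits
    value &= mask
    return ((value >> count) | (value << (bits - count))) & mask

# Without masking, NOT and subtraction both drift into negative numbers
raw_not = ~0x1234                # a negative Python int
fixed_not = to_u32(~0x1234)      # what a 32-bit register would actually hold
wrapped_sub = to_u32(0 - 1)      # 32-bit wraparound of 0 - 1
```

The thing to watch for is right shifts: `>>` on a negative Python int is an arithmetic shift, so values need masking before they are shifted or rotated, not only afterwards.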
Disclaimers aside, all we need to do is print out a representation of the constants and registers we come across, they will be the leaf nodes.
Most of the rest look more or less the same, I've chosen to output textual representations of these, but there's no reason we couldn't output other objects that can perform the calculations themselves.
elif type(value) == LowLevelILAdd:
    rhs = output.pop()
    lhs = output.pop()
    output.append("(%s + %s)" % (lhs, rhs))
elif type(value) == LowLevelILSub:
    rhs = output.pop()
    lhs = output.pop()
    output.append("(%s - %s)" % (lhs, rhs))
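To see the two phases working together without Binary Ninja in the way, here is a self-contained sketch that runs the same loop over nested tuples instead of LLIL objects. The tuple encoding (`("add", lhs, rhs)`, `("reg", name)`, `("const", value)`) is invented for the example, and I'm assuming the todo array is consumed from the end, which is what makes the rhs/lhs pop order come out right.

```python
def render(expr):
    """Render a nested tuple expression as text, using loops instead of recursion."""
    symbols = {"add": "+", "sub": "-"}
    sources = [expr]
    todo = []
    # Phase 1: walk the tree; every parent lands in `todo` before its children
    while sources:
        node = sources.pop()
        todo.append(node)
        if node[0] in symbols:
            sources.append(node[1])
            sources.append(node[2])
        elif node[0] not in ("reg", "const"):
            raise ValueError("unknown node %r" % (node,))
    # Phase 2: replay from the end so children are rendered before their parent
    output = []
    while todo:
        node = todo.pop()
        if node[0] == "const":
            output.append(hex(node[1]))
        elif node[0] == "reg":
            output.append(node[1])
        else:
            rhs = output.pop()
            lhs = output.pop()
            output.append("(%s %s %s)" % (lhs, symbols[node[0]], rhs))
    return output[0]

example = render(("add", ("sub", ("reg", "esi_0"), ("const", 1)), ("const", 4)))
```

The operands are pushed left then right, so they are visited right then left, and replaying the list backwards restores left-to-right order by the time the parent pops its two strings.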
Finally, we resolve the assignments:
if type(assignment) == LowLevelILSetRegSsa:
    return "%s = %s" % (resolve_dest(assignment.dest), resolve_source(assignment.src))
elif type(assignment) == LowLevelILSetRegSsaPartial:
    previous_version = "%s_%s" % (assignment.full_reg.reg, assignment.full_reg.version - 1)
    output = resolve_dest(assignment.full_reg)
    original = "(%s & %s)" % (hex(inverse_masks[assignment.dest.name]), previous_version)
    change = "(%s & %s)" % (hex(masks[assignment.dest.name]), resolve_source(assignment.src))
    full_src = "%s & %s" % (original, change)
    if assignment.dest.name in shifts:
        full_src = "(%s %s)" % (full_src, shifts[assignment.dest.name])
    return "%s = %s" % (output, full_src)
We've outsourced the source and destination resolution so the LowLevelILSetRegSsa case is very straightforward, and the LowLevelILSetRegSsaPartial case just adds a bunch of masking and shifting to make the partial registers behave correctly.
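The masks, inverse_masks and shifts tables aren't shown here, so the following is a numeric sketch of what a partial register write has to do, using a single hypothetical `PARTIALS` table of my own. One thing to be careful of when evaluating the result: the kept bits and the new bits sit under disjoint masks, so they need to be joined with `|`. And-ing the two halves together, as the generated string above does, always evaluates to zero.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical table: sub-register name -> (mask inside the 32-bit register, shift)
PARTIALS = {
    "al": (0x000000FF, 0), "ah": (0x0000FF00, 8), "ax": (0x0000FFFF, 0),
    "bl": (0x000000FF, 0), "bh": (0x0000FF00, 8), "bx": (0x0000FFFF, 0),
    "si": (0x0000FFFF, 0), "di": (0x0000FFFF, 0),
}

def read_partial(full, name):
    """Extract the sub-register `name` from the full 32-bit value."""
    mask, shift = PARTIALS[name]
    return (full & mask) >> shift

def write_partial(full, name, value):
    """Return the full register after writing `value` into sub-register `name`."""
    mask, shift = PARTIALS[name]
    keep = full & ~mask & MASK32          # bits the write leaves alone
    new = (value << shift) & mask         # new bits, moved into place and truncated
    return keep | new
```

For example, `write_partial(0x12345678, "ah", 0xAB)` leaves everything except bits 8 to 15 untouched.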
Looking up dependencies
The goal is to extract the operations from the handler, so let's resolve all dependent registers back to the top and output all the lines we need to calculate the outputs ourselves.
def find_all_dependent_registers(func, llil_ssa, base_assignment):
    assignments = [base_assignment]
    output_assignments = []
    while assignments:
        assignment = assignments.pop()
        log_info("Analysing assignment %s" % assignment)
        output_assignments.append(assignment)
        dependent_registers = find_dependent_registers(assignment)
        for register in dependent_registers:
            log_info("Adding dependent register %s" % register)
            assignment = llil_ssa.get_ssa_reg_definition(register)
            if assignment:
                log_info("Defined at: %s" % assignment)
                assignments.append(assignment)
            else:
                log_info("Register %s has no definition, skipping" % register)
    # convert to pythonesque
    output_python = []
    while output_assignments:
        output_python.append(resolve_assignment(output_assignments.pop()))
    return output_python
We use get_ssa_reg_definition() to find where our registers are defined, and then apply the same iterative depth-first traversal as before. This leaves us with a bunch of assignments in an array. We want to start from the top so we read this array back in reverse, and the resolve_assignment() function generates the output code we want. This will produce duplicate lines of code and we could remove these from the output_python array if we want, but SSA should mean all of our lines are idempotent so it shouldn't hurt to repeat them.
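If the duplicates do get annoying, removing them while keeping the order is a one-liner. This is a sketch with a helper name of my own; it leans on the fact that each SSA name is assigned exactly once, so keeping only the first copy of a line still leaves every definition ahead of its uses.

```python
def dedupe_lines(lines):
    """Drop repeated lines, keeping the first occurrence and the original order."""
    return list(dict.fromkeys(lines))

sample = [
    "esi_1 = (esi_0 - 0x1)",
    "eax_1 = zx(read_mem(esi_1,1), 4)",
    "esi_1 = (esi_0 - 0x1)",
    "esi_2 = (esi_1 - 0x4)",
]
deduped = dedupe_lines(sample)
```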
We'll add some helper functions too, in case we want to start from a specific address, or use a register name to find the final SSA version of it and calculate for this.
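The helpers themselves live in the repo, but the "find the final SSA version by register name" part could look roughly like the sketch below. `latest_ssa_register` is a hypothetical name, and it only assumes each entry has the `.reg.name` and `.version` attributes used elsewhere in this post. Picking the highest version number is a simplification that suits straight-line handlers like this one; code with branches would need more care.

```python
def latest_ssa_register(ssa_registers, name):
    """Return the highest-numbered SSA version of register `name`, or None if absent."""
    candidates = [reg for reg in ssa_registers if reg.reg.name == name]
    return max(candidates, key=lambda reg: reg.version, default=None)
```

In the console you would feed it `llil_ssa.ssa_registers`, then pass the result to `llil_ssa.get_ssa_reg_definition()` to get the starting assignment.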
So what does the output look like?
>>> print("\n".join(find_all_dependent_registers_from_register_name(func, "esi")))
esi_1 = (esi_0 - 0x1)
esi_2 = (esi_1 - 0x4)
>>> print("\n".join(find_all_dependent_registers_from_register_name(func, "edi")))
esi_1 = (esi_0 - 0x1)
eax_1 = zx(read_mem(esi_1,1), 4)
eax_2 = (0xffffff00 & eax_1) & (0xff & ((0xff & eax_1) ^ (0xff & ebx_0)))
eax_3 = (0xffffff00 & eax_2) & (0xff & not((0xff & eax_2), 1))
eax_4 = (0xffffff00 & eax_3) & (0xff & ((0xff & eax_3) - -0x5d))
eax_5 = (0xffffff00 & eax_4) & (0xff & (0xFF & (((0xff & eax_4) << 0x1) | ((0xff & eax_4) >> (8 - 0x1)))))
eax_6 = (0xffffff00 & eax_5) & (0xff & ((0xff & eax_5) + 0x1))
eax_7 = (0xffffff00 & eax_6) & (0xff & (0xFF & (((0xff & eax_6) >> 0x1) | ((0xff & eax_6) << (8 - 0x1)))))
ebx_1 = (0xffffff00 & ebx_0) & (0xff & ((0xff & ebx_0) ^ (0xff & eax_7)))
esi_1 = (esi_0 - 0x1)
esi_2 = (esi_1 - 0x4)
eax_11 = read_mem(esi_2,4)
eax_12 = (eax_11 ^ ebx_1)
eax_13 = (eax_12 + -0x5ca20a41)
eax_14 = not(eax_13, 4)
eax_15 = (eax_14 - 0x40d54c06)
eax_16 = (0xFFFFFFFF & ((eax_15 << 0x2) | (eax_15 >> (32 - 0x2))))
edi_1 = (edi_0 + eax_16)
>>> print("\n".join(find_all_dependent_registers_from_register_name(func, "ebp")))
ebp_1 = (ebp_0 + 0x4)
>>> print("\n".join(find_all_dependent_registers_from_register_name(func, "esp")))
>>> print("\n".join(find_all_dependent_registers_from_register_name(func, "ebx")))
esi_1 = (esi_0 - 0x1)
eax_1 = zx(read_mem(esi_1,1), 4)
eax_2 = (0xffffff00 & eax_1) & (0xff & ((0xff & eax_1) ^ (0xff & ebx_0)))
eax_3 = (0xffffff00 & eax_2) & (0xff & not((0xff & eax_2), 1))
eax_4 = (0xffffff00 & eax_3) & (0xff & ((0xff & eax_3) - -0x5d))
eax_5 = (0xffffff00 & eax_4) & (0xff & (0xFF & (((0xff & eax_4) << 0x1) | ((0xff & eax_4) >> (8 - 0x1)))))
eax_6 = (0xffffff00 & eax_5) & (0xff & ((0xff & eax_5) + 0x1))
eax_7 = (0xffffff00 & eax_6) & (0xff & (0xFF & (((0xff & eax_6) >> 0x1) | ((0xff & eax_6) << (8 - 0x1)))))
ebx_1 = (0xffffff00 & ebx_0) & (0xff & ((0xff & ebx_0) ^ (0xff & eax_7)))
esi_1 = (esi_0 - 0x1)
esi_2 = (esi_1 - 0x4)
eax_11 = read_mem(esi_2,4)
eax_12 = (eax_11 ^ ebx_1)
eax_13 = (eax_12 + -0x5ca20a41)
eax_14 = not(eax_13, 4)
eax_15 = (eax_14 - 0x40d54c06)
eax_16 = (0xFFFFFFFF & ((eax_15 << 0x2) | (eax_15 >> (32 - 0x2))))
esi_1 = (esi_0 - 0x1)
eax_1 = zx(read_mem(esi_1,1), 4)
eax_2 = (0xffffff00 & eax_1) & (0xff & ((0xff & eax_1) ^ (0xff & ebx_0)))
eax_3 = (0xffffff00 & eax_2) & (0xff & not((0xff & eax_2), 1))
eax_4 = (0xffffff00 & eax_3) & (0xff & ((0xff & eax_3) - -0x5d))
eax_5 = (0xffffff00 & eax_4) & (0xff & (0xFF & (((0xff & eax_4) << 0x1) | ((0xff & eax_4) >> (8 - 0x1)))))
eax_6 = (0xffffff00 & eax_5) & (0xff & ((0xff & eax_5) + 0x1))
eax_7 = (0xffffff00 & eax_6) & (0xff & (0xFF & (((0xff & eax_6) >> 0x1) | ((0xff & eax_6) << (8 - 0x1)))))
ebx_1 = (0xffffff00 & ebx_0) & (0xff & ((0xff & ebx_0) ^ (0xff & eax_7)))
ebx_2 = (ebx_1 ^ eax_16)
Lots of repetition caused by the ebx decryption, but we can also see a couple of main things:
- our VIP register, esi, gets decremented by 5 (in this VMProtect VM, the VIP counts backwards), which means we're reading 1 byte from the bytecode, and then a final DWORD to get the address of the next handler
- our stack pointer register, ebp, gets advanced by 4, which suggests we popped a DWORD off the virtual stack, but didn't put anything back on (so we haven't done any arithmetic)
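Putting the extracted lines together, the register arithmetic of this one handler can be replayed as ordinary Python. This is a hand-written sketch rather than plugin output: the constants are the ones printed above, while `step_handler`, the `memory`/`base` pair (a bytes buffer and the address it is loaded at) and the returned dictionary are my own framing. Partial writes are merged with OR and every step is masked to its register width.

```python
MASK32 = 0xFFFFFFFF

def rol(value, count, bits):
    """Rotate left within a `bits`-wide field."""
    mask = (1 << bits) - 1
    return ((value << count) | (value >> (bits - count))) & mask

def ror(value, count, bits):
    """Rotate right within a `bits`-wide field."""
    mask = (1 << bits) - 1
    return ((value >> count) | (value << (bits - count))) & mask

def step_handler(memory, base, esi, edi, ebx):
    """Replay the esi/edi/ebx updates of the handler at 0x5babbd."""
    def read(address, size):
        offset = address - base
        return int.from_bytes(memory[offset:offset + size], "little")

    # Operand byte: one byte below the VIP, decrypted with the low byte of ebx
    al = read(esi - 1, 1) ^ (ebx & 0xFF)
    al = ~al & 0xFF
    al = (al - 0xA3) & 0xFF              # printed as "- -0x5d" in the output
    al = rol(al, 1, 8)
    al = (al + 1) & 0xFF
    al = ror(al, 1, 8)
    ebx = (ebx & 0xFFFFFF00) | ((ebx ^ al) & 0xFF)

    # Next handler offset: the DWORD just below the operand byte
    eax = read(esi - 5, 4) ^ ebx
    eax = (eax - 0x5CA20A41) & MASK32    # printed as "+ -0x5ca20a41"
    eax = ~eax & MASK32
    eax = (eax - 0x40D54C06) & MASK32
    eax = rol(eax, 2, 32)
    return {"operand": al, "esi": esi - 5, "edi": (edi + eax) & MASK32, "ebx": ebx ^ eax}
```

The `operand` value corresponds to eax#7, and the new `edi` is the address of the next handler.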
These two things alone are pretty good clues that we've loaded a DWORD from the stack and put it into a virtual register in the scratch space. We haven't handled memory writes, and the next important step would be to find all LowLevelILStoreSsa instructions and collect them somewhere too:
>>> llil_ssa[24]
<llil: [esp#0 + eax#7].d = ecx#7 @ mem#0 -> mem#1>
>>> type(llil_ssa[24])
<class 'binaryninja.lowlevelil.LowLevelILStoreSsa'>
With heuristics we would know that since esp is our scratch space base, we just need to resolve eax#7 and we'll know which number register we are writing to.
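Collecting those stores could be as simple as the sketch below. `collect_stores` is a name I've made up; it matches on the class name rather than using isinstance so that it can be tried outside Binary Ninja, and it assumes each store exposes the `address`, `dest` and `src` properties.

```python
def collect_stores(instructions):
    """Return (address, dest expression, src expression) for every SSA memory store."""
    stores = []
    for insn in instructions:
        if type(insn).__name__ == "LowLevelILStoreSsa":
            stores.append((insn.address, insn.dest, insn.src))
    return stores
```

In the console this would be called as `collect_stores(llil_ssa.instructions)`, and each `dest` expression (here esp#0 + eax#7) can then be resolved the same way as any other source.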
In any case, the code is at https://github.com/samrussell/vmprotect_binja_plugin, so feel free to have a play with it and see what else you can do.
Takeaways
It took a while to find the right level to look at, but ultimately the Binary Ninja LLIL is very useful, the Python interface is fantastic for interacting with it, and it does about 90% of the heavy lifting for us. I suspect once we get to the arithmetic operations we'll run into some problems with managing where the flags originate from, and that will require us to step backwards through the instruction array rather than directly access these. The LLIL does keep track of some flags that are directly set (there are 8 versions of the carry flag in this handler, for example), but we will have to implement the flag calculation for arithmetic ourselves. Having said this, the flag usage in the handlers is fairly straightforward in earlier versions of VMProtect, and the problems only arise when handling the lifted opcodes in later analysis.
Another nice surprise was how the HLIL was really useful in finding the address of the first VM handler, and it would be nice if there was a way to customise this more. The dead code and obfuscated jump handling isn't perfect, but we do get a bunch of stuff for free from both the HLIL and the LLIL, and I feel like Binary Ninja is going to be quite a useful tool for handling a sample like this.
Anyway, I hope you got something out of this. Good luck and happy reversing.