Log in

View Full Version : Control Flow Deobfuscation Part 2


fr33ke
April 25th, 2008, 09:01
In part one ("http://www.woodmann.com/forum/blog.php?b=47") I explained how to get a CFG from bytecode, and today I'll tell you about the inverse operation.
Well, actually there are two parts to it:

Decide in which order you place the vertices
Put them in that order and insert the correct branches

I'll discuss point 1 in part three, and point 2 now.

It may sound trivial to do this: just get the data for the first vertex, then the branch for the first vertex etc. However this doesn't work because to create the branch we need to know the position of its targets. But for the position of the targets we need to know how big the branch is!

To create an optimal solution where all branches are as small as possible we use a simplification of Robertson's algorithm. The original paper is hard to find but there is good documentation of it in the source of YASM ( http://www.tortall.net/projects/yasm/browser/branches/yasm-0.7.x/libyasm/section.c#L698 ). Basically what we do is assume all branches are 0 bytes1, then increase the size for each branch that doesn't fit and repeat until we find a fixed point where all branches fit.

We can skip some sizes that are impossible, for instance there exists no 1 byte branch in .NET. However a 4 byte branch is possible with two 2 byte branches ("jnz xxx; jmp xxx". Also we don't have to recalculate every branch if one doesn't fit, only the ones that are affected by the increased branch size; but since I saw little performance increase (surprisingly), I chose to use the simpler version.

Now that I have explained everything, here is the pseudocode:
Code:
rebuild_from_order(cfg, order):
for vertex in order
size_of_branch[vertex] = 0

do
length = 0
for vertex in order
start[vertex] = length
length += vertex.length + size_of_branch[vertex]

has_changed = false
for vertex in order
for edge in vertex.edges
delta[edge] = start[edge] - (start + length(vertex.bytecode))
if not branch_fits(vertex.branchtype, delta, size_of_branch[vertex])
has_changed = true
do
size_of_branch[vertex]++
while not possible_branch_size(size_of_branch[vertex])
while has_changed

bytecode = empty
for vertex in order
length = 0
for vertex in order
start[vertex] = length
length += vertex.length + size_of_branch[vertex]

for edge in vertex.edges
delta[edge] = start[edge] - (start + vertex.length)

bytecode += vertex.bytecode
bytecode += create_branch(vertex.branchtype, delta)

return bytecode

See you in part 3 which should be ready in 2009


1 This is possible if it's an unconditional branch with a delta of 0 bytes: "jmp next; next:" is the same as "".

dELTA
April 28th, 2008, 04:21
Hehe, glad to see that this interesting article series is not dead. Looking forward to 2009 then.