View Full Version : [TOOL] IDA Deobfuscator plugin


mpompeo
January 13th, 2009, 11:54
Hi all,

this is my new, basic deobfuscator plugin for IDA.

Go to an obfuscated code sequence, start the plugin (ALT+O), enter the end address (the start address is already filled in) and go.

I won't push this POC version much, as it needs a full rewrite (it is a quick & dirty tool), but I am curious to know 'how it works' for you, and where it mainly fails.


(I didn't test it on 5.3; drop me a line if it doesn't work and I'll update it)

edit--- 0.2, which fixes the checkboxes (I had mixed up some flags :P )
edit--- 0.3, which adds some basic constant accumulation.
edit--- 0.5, minor constant accumulation, minor push/pop folding
edit--- 0.51, fixed a bug introduced in 0.5 (an "IF" code flow): now it removes the junk and keeps the good instruction (+ fixed table mode, now it works)
edit--- 0.6, a few bugfixes, added the stack layout for registers at the end of the deobfuscated block.
edit--- 0.7, common indirect register tracking of values (experimental)
edit--- 0.71, removed a debug check left in by mistake, which caused frequent crashes...
edit--- 0.76b, fixed a bunch of deadly errors, added 'selection' to be taken as start-end, and a few other changes

Sirmabus
January 22nd, 2009, 05:01
Bon giorno pizano,

What does it deobfuscate from, and into what?
Can you post a little screenshot?

Shub-nigurrath
January 22nd, 2009, 10:19
What is not clear, IMHO, is what it deobfuscates from and to, where the obfuscation info is entered, and so on. Could you provide a little video tutorial on how to use it?

mpompeo
January 22nd, 2009, 14:11
hi, here is a usage sample, plus 0.75, which hopefully fixes all the ugly bugs I introduced when I extended the constant folding in 0.6x-0.7x...

http://img141.imageshack.us/img141/6130/idadeob1dr5.jpg (http://imageshack.us)

it deobfuscates the following code (cut&paste from IDA)
Code:

sar dx, cl
inc ah
add dh, 6Eh
mov eax, [ebp+0]
neg dl
shl dx, 9
mov dx, [ebp+4]
cmp cl, al
pushf
clc
cmp cx, di
add ebp, 6
jmp loc_4250D1
[...]
loc_4250D1:
call $+5
push [dword ptr esp]
mov [ss:eax], dx
pushf
pushf
lea esp, [esp+14h]
jmp common_end


http://img220.imageshack.us/img220/669/idadeob1restm4.jpg (http://imageshack.us)

this is another sample, used in IDA: I just filled in the start and end of the obfuscated code, which is
Code:

common_end:

lea edx, [ebx+5FB827D6h]
btr dx, 4
rol al, cl
sal dl, 1
mov al, [esi-1]
cmp ah, dh
clc
shrd dx, bx, 1
xor al, bl
btc dx, 0Fh
movsx edx, cl
pushf
rol al, 3
sal dx, 0Ah
inc dx
add dh, 0B7h
btr edx, edx
xor al, 1
cmp esi, esi
adc dh, 25h
btr dx, cx
sub dl, bl
sub al, 37h
cmc
pop edx
call continue_1
continue_1:
neg al
inc dl
lea esi, [esi-1]
bsr dx, ax
dec al
pop edx
pusha
not al
pop edx
db 66h
bswap edx
movzx edx, bl
setnl dl
inc al
bts dx, bx
xor bl, al
cmc
clc
movzx eax, al
adc dl, al
db 66h
bswap edx
jmp continue_2
continue_2:
sbb dl, 17h
mov edx, [ds:Address_Table+eax*4]
mov [esp], ch
jmp continue_3
continue_3:
mov [esp+8], di
clc
bswap edx
call continue_4
proc continue_4 near
cmc
stc
add edx, 0D5B781C2h
push eax
jmp continue_5
continue_5:
bswap edx
cmc
cmp bh, 49h
bt sp, 1
pushf
sub edx, 89010C55h
cmc
cmp bh, 0BEh
jmp continue_end
continue_end:
clc
bswap edx
stc
add edx, 0
mov [byte ptr esp+0], 0Ch
mov [esp+return_address_of_ret____CARE], edx
pushf
push [esp+4+return_address_of_ret____CARE]
retn 2Ch


http://img220.imageshack.us/img220/8020/idadeob2resuh7.jpg (http://imageshack.us)

Basically, you set a start/end, and then it prunes the obfuscated code. The output destination is chosen with the first 3 options: either notepad, an IDB anterior comment (this EDITS your idb, take care), or just the IDA log window as messages. A little IDA SDK note: call_system() waits for the called process to end, so until you close notepad you won't be able to click in IDA.

The 'table mode' just cycles through a table of pointers, deobfuscating all handlers (useful for VMs). When it fails, it stops, so you know which instruction could not be deobfuscated (hopefully). To use table mode, you need to position the cursor at the start of the table, fill in 'pointers table end' and set table mode, of course. The 'max valid pointer address' is used to prevent invalid entries in the handler table from causing problems.
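
A minimal Python sketch of the table walk described above (not the plugin's actual code). The names `walk_handler_table` and `deobfuscate` are hypothetical, and skipping out-of-range entries is an assumption; the plugin may treat invalid entries differently.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def walk_handler_table(
    table: Iterable[int],
    max_valid_pointer: int,
    deobfuscate: Callable[[int], bool],
) -> Tuple[List[int], Optional[int]]:
    """Walk a table of handler pointers and deobfuscate each handler.

    Returns (handlers processed, address of the handler that failed or None).
    `deobfuscate` is a hypothetical callback returning True on success.
    """
    done: List[int] = []
    for ptr in table:
        # Assumption: null or out-of-range entries are skipped, not fatal.
        if ptr == 0 or ptr > max_valid_pointer:
            continue
        # Stop at the first failure so the caller knows where it happened.
        if not deobfuscate(ptr):
            return done, ptr
        done.append(ptr)
    return done, None


if __name__ == "__main__":
    entries = [0x401000, 0xFFFFFFFF, 0x402000, 0x403000]
    ok, failed_at = walk_handler_table(entries, 0x500000, lambda a: a != 0x402000)
    print([hex(a) for a in ok], hex(failed_at) if failed_at is not None else None)
```

Returning the failing address instead of raising keeps the "stop and show where it failed" behaviour explicit for the caller.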

The 'relative tracking' is shown in the 1st sample: see the end layout line for EBP that says [EBP+0]=[EBP+4].

Constants moved back and forth on the stack are tracked and sent to the output, so you know what your stack will look like AFTER the deobfuscated sequence.
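
As a rough illustration of this kind of stack tracking (a sketch under simplifying assumptions, not the plugin's implementation), the Python below simulates only `push` and `pop` on a symbolic stack. The function name `track_stack` and the tuple encoding of instructions are hypothetical.

```python
from typing import Dict, List, Tuple, Union

Value = Union[int, str]  # an immediate constant or a symbolic register name


def track_stack(
    instructions: List[Tuple[str, Value]],
) -> Tuple[Dict[str, Value], List[Value]]:
    """Simulate push/pop and return (known register values, final stack).

    The top of the stack is the last list element. Anything other than
    push/pop is ignored in this simplified model.
    """
    stack: List[Value] = []
    regs: Dict[str, Value] = {}
    for op, arg in instructions:
        if op == "push":
            # Push the tracked value of a register if known, else the operand.
            stack.append(regs.get(arg, arg) if isinstance(arg, str) else arg)
        elif op == "pop":
            # Popping from an empty model stack yields an unknown value.
            regs[str(arg)] = stack.pop() if stack else "unknown"
    return regs, stack


if __name__ == "__main__":
    regs, stack = track_stack(
        [("push", 0x0C), ("push", "eax"), ("pop", "edx"), ("pop", "ecx")]
    )
    print(regs, stack)
```

In this model the push/pop pairs cancel out, leaving only the net register assignments and whatever remains on the stack at the end of the sequence.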

It can accumulate simple transformations whenever possible, and track down basic assignments.
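
To give an idea of what accumulating simple transformations can look like, here is a small Python sketch that folds consecutive `add`/`sub` immediates on the same 32-bit register. It is an illustration only: `fold_add_sub` and the tuple format are hypothetical, the constants are made up, and the sketch assumes the flags set by the folded instructions are not used afterwards.

```python
from typing import List, Tuple

MASK32 = 0xFFFFFFFF

Instr = Tuple[str, str, int]  # (mnemonic, register, immediate)


def fold_add_sub(instructions: List[Instr]) -> List[Instr]:
    """Fold runs of `add reg, imm` / `sub reg, imm` into a single `add`.

    Assumption: flags produced by the folded instructions are dead.
    """
    out: List[Instr] = []
    for op, reg, imm in instructions:
        if op not in ("add", "sub"):
            out.append((op, reg, imm))
            continue
        delta = imm if op == "add" else -imm
        if out and out[-1][0] == "add" and out[-1][1] == reg:
            # Merge with the previous add on the same register (mod 2**32).
            out[-1] = ("add", reg, (out[-1][2] + delta) & MASK32)
        else:
            out.append(("add", reg, delta & MASK32))
    # An accumulated constant of zero means the whole run was junk.
    return [i for i in out if not (i[0] == "add" and i[2] == 0)]


if __name__ == "__main__":
    code = [
        ("add", "eax", 0x10),
        ("sub", "eax", 0x4),
        ("mov", "ebx", 1),
        ("add", "eax", 5),
        ("sub", "eax", 5),
    ]
    for ins in fold_add_sub(code):
        print(ins)
```

Any instruction between two arithmetic operations (a `bswap`, for instance) breaks the run in this sketch, so real obfuscated sequences need more than this to collapse.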

Almost everywhere you will find an address before a line, as it allows a quick click-jump. Whenever you get to a Jcc or a RET, it will ask you for the address it should jump to. Exiting at that prompt will 'stop' the analysis there.

This is made to ease analysis, not to 'rewrite' code. I had an idea for a generic deobfuscation method and wanted to 'test it out'.

Polaris
January 22nd, 2009, 15:30
Hello,

first of all, thanks for the contribution. The plugin seems to work well enough; however, I am getting access violations when trying to deobfuscate too much data.

Moreover, I have some usability suggestions:

- What about using the "selection" to determine the boundaries of the code? IMHO, it is more convenient than navigating to the end of the obfuscated section, especially when dealing with huge amounts of code.
- Again related to huge amounts of code: it would be nice to either get rid of the message box reporting the number of instructions read, or at least make it appear only every 1000 instructions processed.
- Please avoid spawning notepad. I would say that the txt file is enough, and if you want to have the display integrated into IDA you can, for example, create another tab and use that to display the results.


However, these are minor things; keep up the good work!!!

Sirmabus
January 22nd, 2009, 20:36
You could also just send it to the clipboard so one can paste the text into notepad or similar.
See: http://www.codeproject.com/KB/clipboard/clipboard_faq.aspx ("http://www.codeproject.com/KB/clipboard/clipboard_faq.aspx")

Humm, I've had an idea along the same lines.
What I would like to have (or make, as it doesn't seem to exist yet) is a Themida clean-up plug-in.

Basically something that goes through a dumped Themida target and cleans it up to make it more readable, and thus easier to RE.
Ideally it would understand what the Themida opcodes are and place names for them (e.g. "move a, b").
It might even take a modified IDA x86 processor module.

At least it should go through and mark the Themida'ized groups with colored text or something, and then fix the real x86 code around them (IDA gets confused by it). It should also fix functions that have Themida codes in them; as it stands, with the default (meta PC?) processor module IDA doesn't understand Themida, so it won't make proper functions with those opcodes.

IMHO this could almost nullify what Themida and similar protections do when looking at dumps for RE purposes.
Even better if one can fix the IAT redirections (from a debugger plug-in), recognize the API emulations (using IDA sigs?), etc.

Edit: I found today that a lot of work has already been done in this area.
http://www.datasecurity-event.com/uploads/caro_obfuscation.ppt ("http://www.datasecurity-event.com/uploads/caro_obfuscation.ppt")
http://www.datasecurity-event.com/uploads/boris_lau_virtualization_obfs.pdf ("http://www.datasecurity-event.com/uploads/boris_lau_virtualization_obfs.pdf")
Etc..

mpompeo
January 30th, 2009, 18:48
uhmmm, the plugin can handle at best 4k instructions in a single pass, sorry. However, in .76b I added the 'use selection' feature, np.
Anyway, I am working on a serious rewrite of the plugin; the code is just too "dirty" for my taste.
The next version will be available for both Olly and IDA, as it will use libdisasm as a 'neutral layer' for disassembly.

Automatically analysing a complex VM... mah, it is a very complex subject, and I am not at all sure it is possible to make a generic solver for such things.
For sure, it looks like a *very* interesting field of research.

Sab
January 30th, 2009, 20:55
But go for it. If anything, it will spark interest in new ideas and other development, and it may be useful in many instances, even if not fully generic.

hazard
February 18th, 2009, 06:39
Nice plugin, is it possible to get source for study?