View Full Version : UD2 opcode


TBone
April 19th, 2004, 15:55
As part of my ongoing effort to modernize my knowledge of how program execution works from top to bottom, I'm reading through the Intel IA-32 Architecture Software Developer's Manuals to brush up on assembly before I tackle more complex topics like the PE/COFF format, etc. I haven't really studied the subject since the DOS days, when things were decidedly simpler, heh.

Anyway, I came across this opcode and thought it was kind of amusing, so I played around with it a little. Intel's specifications indicate that it's functionally equivalent to NOP except that it raises an invalid opcode exception (#UD). Ok, sounds fun. So I wrote a little .asm file to test it out. It turns out that MASM doesn't know how to assemble a UD2 opcode, so I just assembled two NOPs instead:
Code:
.386
.model flat, stdcall

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc

includelib \masm32\lib\kernel32.lib

.code
start:
NOP
NOP
invoke ExitProcess, NULL
end start

Then I pulled out a hex editor and changed the 90 90 to 0F 0B (UD2).
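
For anyone who would rather script the patch than do it by hand in a hex editor, here is a minimal Python sketch of the same byte swap. The helper name patch_first and the sample buffer are made up for illustration; the buffer is just a stand-in for the start of the code section, not the real file. On an actual executable you would want to check the offset before writing, since 90 90 could in principle appear somewhere else first.

```python
def patch_first(data: bytes, old: bytes, new: bytes) -> bytes:
    """Replace the first occurrence of `old` with `new` (same length)."""
    if len(old) != len(new):
        raise ValueError("patch must not change the length of the data")
    offset = data.find(old)
    if offset < 0:
        raise ValueError("byte pattern not found")
    return data[:offset] + new + data[offset + len(old):]


# Hypothetical stand-in for the code section: nop; nop; push 0; call; int3
code = bytes.fromhex("90906a00e801000000cc")

# Swap the two NOPs (90 90) for UD2 (0F 0B)
patched = patch_first(code, b"\x90\x90", b"\x0f\x0b")
print(patched.hex(" "))
```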

Executing the binary does nothing, which isn't all that surprising, but I thought that maybe Windows would at least bark about an invalid opcode exception. I tried loading it with OllyDbg and it generated the exception as soon as you start execution. Passing the exception to the program caused it to transfer control to ntdll.dll. I don't have an import library built yet, so it's a little hard to tell what happens after that. According to Olly, the application couldn't handle the exception, and it terminates. Hmm, ok. It sounds like Windows itself just doesn't handle that exception. Am I interpreting this right?

Secondly, if you load the executable with IDA, it doesn't seem to know what to do with that opcode. W32DASM at least shows it as "invalid opcode", but IDA just makes a mess of it:
Code:
.text:00401000 _text segment para public 'CODE' use32
.text:00401000 assume cs:_text
.text:00401000 ;org 401000h
.text:00401000 assume es:nothing, ss:nothing, ds:nothing, fs:nothing, gs:nothing
.text:00401000 public start
.text:00401000 start dd 6A0B0Fh, 1E8h, 25FFCC00h, 402000h, 7Ch dup(0)
.text:00401000 _text ends
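
The dd line is easier to read once the dwords are turned back into bytes. A quick Python sketch, assuming the usual little-endian x86 layout (the four values are copied from the listing above; the variable names are just for illustration):

```python
import struct

# The four dwords IDA emitted for `start`, in listing order
dwords = [0x6A0B0F, 0x1E8, 0x25FFCC00, 0x402000]

# Pack each as a little-endian 32-bit value to recover the raw bytes
raw = struct.pack("<4I", *dwords)
print(raw.hex(" "))
# prints: 0f 0b 6a 00 e8 01 00 00 00 cc ff 25 00 20 40 00
```

So the 0F 0B pair is sitting right at the front, followed by 6A 00 (push 0), E8 01 00 00 00 (a call), CC (int 3) and FF 25 00 20 40 00 (an indirect jmp). IDA simply folded the whole run into data.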


With a little undefining and manual defining, I can at least get:
Code:
.text:00401000 ; Segment type: Pure code
.text:00401000 _text segment para public 'CODE' use32
.text:00401000 assume cs:_text
.text:00401000 ;org 401000h
.text:00401000 assume es:nothing, ss:nothing, ds:nothing, fs:nothing, gs:nothing
.text:00401000 public start
.text:00401000 start db 0Fh ;
.text:00401001 db 0Bh ;
.text:00401002 ; ---------------------------------------------------------------------------
.text:00401002 push 0
.text:00401004 call ExitProcess
.text:00401009 int 3 ; Trap to Debugger
.text:0040100A ; [00000006 BYTES: COLLAPSED FUNCTION ExitProcess. PRESS KEYPAD "+" TO EXPAND]
.text:00401010 align 200h
.text:00401010 _text ends


Is there some way to define that opcode for IDA? It's not really a big deal; I just found it odd that the most highly esteemed disassembler had no idea how to handle it.

evaluator
April 19th, 2004, 22:25
search IDA support forum. maybe also RTM?

disavowed
April 19th, 2004, 23:00
evaluator, you should rtm if you think ida has an m to r

TBone
April 20th, 2004, 12:04
Erm, thanks...

I have been r-ing through the m, thank you. You can define custom operands, but as far as I can tell, not custom opcodes/instructions. The only thing that sounded halfway pertinent was this snippet:
Code:
Display bad instruction <BAD> marks

Some assemblers do not understand some instructions even if they
should to. For example, the Z80 processor has several undocumented
instructions and many assemblers fail to recognize them. IDA knows
about this fact and tries to produce an output that can be compiled
without errors, so it replaces such instructions with data bytes.

The problem is more severe with Intel 80x86 processors: the same
instruction can be coded differently. There are 2 operation codes
for ADD instruction, etc. The worst thing is that the different
operation codes have different lengths. If the assembler used to
compile a file and your assembler produce different operation codes,
you may obtain completely different output files.

That is why IDA can mark such instructions as <BAD> and replace them
with data bytes. Example:

Enabled:
db 0Fh,86h,7Eh,0,0,0 ; <BAD> jbe loc_0_205
db 0Fh,82h,78h,0,0,0 ; <BAD> jb loc_0_205
db 0Fh,83h,72h,0,0,0 ; <BAD> jnb loc_0_205
Disabled:
jbe loc_0_205
jb loc_0_205
jnb loc_0_205

IDA.CFG parameter: SHOW_BAD_INSTRUCTIONS

But as you can see from what I posted, the db XXh, XXh, XXh output that I'm getting doesn't result from <BAD> marking. If that were the case, we might see something like:
Code:
db 0Fh, 0Bh ; <BAD> ud2

But as it is, IDA just doesn't seem to know what to do with it. At any rate I've tried it with SHOW_BAD_INSTRUCTIONS set to YES and to NO, and it predictably makes no difference.

I checked with the IDA forums -- there's no mention of UD2 specifically, and I can't find anything about handling unknown opcodes there, either. Like I said, it's no big deal. A) I was just curious. B) I thought it was interesting that it not only didn't interpret the 0F 0B opcode, but it also screwed up interpretation of adjacent code. That seems like kind of strange (and potentially exploitable) behavior. Seeding a few of these in key sections of code would make it a real PITA to disassemble it with IDA. It wouldn't really be a serious deterrent; it's just one more thing in the bag o' tricks you could use to scramble and confuse reversing tools.
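
To illustrate why the adjacent code shouldn't have to suffer: UD2 is a fixed two-byte encoding, so a decoder that recognizes it can just step over it and carry on. Below is a toy Python sketch, not how IDA or any real disassembler works. It only knows the five encodings present in this binary, the function name decode is made up, and it doesn't guard against truncated input or sign-extend the push immediate.

```python
import struct


def decode(code: bytes, base: int):
    """Toy linear-sweep decoder for the handful of opcodes in this binary."""
    out = []
    i = 0
    while i < len(code):
        addr = base + i
        if code[i:i + 2] == b"\x0f\x0b":      # UD2, always 2 bytes
            out.append((addr, "ud2"))
            i += 2
        elif code[i] == 0x6A:                 # push imm8
            out.append((addr, f"push {code[i + 1]:#x}"))
            i += 2
        elif code[i] == 0xE8:                 # call rel32
            (rel,) = struct.unpack_from("<i", code, i + 1)
            out.append((addr, f"call {addr + 5 + rel:#x}"))
            i += 5
        elif code[i] == 0xCC:                 # int 3
            out.append((addr, "int3"))
            i += 1
        elif code[i:i + 2] == b"\xff\x25":    # jmp dword ptr [disp32]
            (target,) = struct.unpack_from("<I", code, i + 2)
            out.append((addr, f"jmp dword ptr [{target:#x}]"))
            i += 6
        else:                                 # unknown: emit one data byte
            out.append((addr, f"db {code[i]:#04x}"))
            i += 1
    return out


# Bytes reconstructed from IDA's dd line in the first listing
code = bytes.fromhex("0f0b6a00e801000000ccff2500204000")
listing = decode(code, 0x401000)
for addr, text in listing:
    print(f"{addr:08X}  {text}")
```

With UD2 treated as an ordinary two-byte instruction, the push at 00401002 and the call at 00401004 fall out at the same addresses as in the manually defined listing.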

gabri3l
April 20th, 2004, 20:07
In the new Intel manual, UD2 is listed with two different opcodes: 0F0B and 0FB9. Maybe IDA recognizes the other one?