UD2 opcode [Archive] - RCE Messageboard's Regroupment

TBone

April 19th, 2004, 15:55

As part of my ongoing effort to modernize my knowledge of how program execution works from top to bottom, I'm reading through the Intel IA-32 Architecture Software Developer's Manuals to brush up on assembly before I tackle more complex topics like the PE/COFF format, etc. I haven't really studied the subject since the DOS days, when things were decidedly simpler, heh.

Anyway, I came across this opcode and thought it was kind of amusing, so I played around with it a little. Intel's specifications indicate that it's functionally equivalent to NOP except that it raises an invalid opcode exception (#UD). OK, sounds fun. So I wrote a little .asm file to test it out. It turns out that MASM doesn't know how to assemble a UD2 opcode, so I just assembled two NOPs instead:

Code:

.386
.model flat, stdcall

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc

includelib \masm32\lib\kernel32.lib

.code
start:
    NOP
    NOP
    invoke ExitProcess, NULL
end start

Then I pulled out a hex editor and changed the 90 90 to 0F 0B (UD2).
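
For anyone who would rather script the patch than open a hex editor, here is a minimal Python sketch of the same byte swap. The function name is made up for illustration, and it assumes the first 90 90 pair in the buffer is the one at the entry point. On a real executable you would want to confirm the offset before writing anything back.

```python
NOP_PAIR = bytes([0x90, 0x90])  # the two one-byte NOPs assembled by MASM
UD2 = bytes([0x0F, 0x0B])       # the UD2 encoding used in this post


def patch_nops_to_ud2(image: bytes) -> bytes:
    """Return a copy of image with the first 90 90 pair replaced by 0F 0B.

    Hypothetical helper: it does not parse the PE headers, it only does a
    plain byte search, so it assumes the first match is the intended one.
    """
    offset = image.find(NOP_PAIR)
    if offset < 0:
        raise ValueError("no 90 90 pair found")
    return image[:offset] + UD2 + image[offset + len(NOP_PAIR):]
```

Both sequences are two bytes long, so the patch leaves the file size and every later offset unchanged.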

Executing the binary does nothing, which isn't all that surprising, but I thought that maybe Windows would at least bark about an invalid opcode exception. I tried loading it with OllyDbg and it generated the exception as soon as you start execution. Passing the exception to the program caused it to transfer control to ntdll.dll. I don't have an import library built yet, so it's a little hard to tell what happens after that. According to Olly, the application couldn't handle the exception, and it terminates. Hmm, OK. It sounds like Windows itself just doesn't handle that exception. Am I interpreting this right?

Secondly, if you load the executable with IDA, it doesn't seem to know what to do with that opcode. W32DASM at least shows it as "invalid opcode", but IDA just makes a mess of it:

Code:

.text:00401000 _text           segment para public 'CODE' use32
.text:00401000                 assume cs:_text
.text:00401000                 ;org 401000h
.text:00401000                 assume es:nothing, ss:nothing, ds:nothing, fs:nothing, gs:nothing
.text:00401000                 public start
.text:00401000 start           dd 6A0B0Fh, 1E8h, 25FFCC00h, 402000h, 7Ch dup(0)
.text:00401000 _text           ends
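
To see what that dd line is hiding, the dwords can be unpacked back into the raw bytes. This is a rough Python sketch (the helper name is only for illustration). It assumes nothing beyond x86 being little-endian, so each dword is stored low byte first.

```python
import struct

# The four non-zero dwords from IDA's "dd" line above.
IDA_DWORDS = [0x6A0B0F, 0x1E8, 0x25FFCC00, 0x402000]


def dwords_to_bytes(dwords):
    """Pack 32-bit values as little-endian, the way they sit in the file."""
    return struct.pack("<%dI" % len(dwords), *dwords)


raw = dwords_to_bytes(IDA_DWORDS)
print(raw.hex(" "))
# prints: 0f 0b 6a 00 e8 01 00 00 00 cc ff 25 00 20 40 00
```

Read in order, that is 0F 0B (the UD2), 6A 00 (push 0), E8 01 00 00 00 (the call), CC (int 3) and FF 25 00 20 40 00 (the jump through the import table). So the code bytes are all intact; IDA has just displayed the whole run as data.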

With a little undefining and manual defining, I can at least get:

Code:

.text:00401000 ; Segment type: Pure code
.text:00401000 _text           segment para public 'CODE' use32
.text:00401000                 assume cs:_text
.text:00401000                 ;org 401000h
.text:00401000                 assume es:nothing, ss:nothing, ds:nothing, fs:nothing, gs:nothing
.text:00401000                 public start
.text:00401000 start           db  0Fh
.text:00401001                 db  0Bh
.text:00401002 ; ---------------------------------------------------------------------------
.text:00401002                 push    0
.text:00401004                 call    ExitProcess
.text:00401009                 int     3               ; Trap to Debugger
.text:0040100A ; [00000006 BYTES: COLLAPSED FUNCTION ExitProcess. PRESS KEYPAD "+" TO EXPAND]
.text:00401010                 align 200h
.text:00401010 _text           ends

Is there some way to define that opcode for IDA? It's not really a big deal; I just found it odd that the most highly esteemed disassembler had no idea how to handle it.

evaluator

April 19th, 2004, 22:25

search IDA support forum. maybe also RTM?

disavowed

April 19th, 2004, 23:00

evaluator, you should rtm if you think ida has an m to r

TBone

April 20th, 2004, 12:04

Erm, thanks...

I have been r-ing through the m, thank you. You can define custom operands, but as far as I can tell, not custom opcodes/instructions. The only thing that sounded halfway pertinent was this snippet:

Code:

Display bad instruction <BAD> marks

        Some assemblers do not understand some instructions even if they
        should to. For example, the Z80 processor has several undocumented
        instructions and many assemblers fail to recognize them. IDA knows
        about this fact and tries to produce an output that can be compiled
        without errors, so it replaces such instructions with data bytes.

        The problem is more severe with Intel 80x86 processors: the same
        instruction can be coded differently. There are 2 operation codes
        for ADD instruction, etc. The worst thing is that the different
        operation codes have different lengths. If the assembler used to
        compile a file and your assembler produce different operation codes,
        you may obtain completely different output files.

        That is why IDA can mark such instructions as <BAD> and replace them
        with data bytes. Example:

           Enabled:
                        db 0Fh,86h,7Eh,0,0,0 ; <BAD> jbe     loc_0_205
                        db 0Fh,82h,78h,0,0,0 ; <BAD> jb      loc_0_205
                        db 0Fh,83h,72h,0,0,0 ; <BAD> jnb     loc_0_205
           Disabled:
                        jbe     loc_0_205
                        jb      loc_0_205
                        jnb     loc_0_205

        IDA.CFG parameter: SHOW_BAD_INSTRUCTIONS

But as you can see from what I posted, the db XXh, XXh, XXh output that I'm getting doesn't result from <BAD> marking. If that were the case, we might see something like:

Code:

db 0Fh, 0Bh ; <BAD> ud2

But as it is, IDA just doesn't seem to know what to do with it. At any rate I've tried it with SHOW_BAD_INSTRUCTIONS set to YES and to NO, and it predictably makes no difference.

I checked with the IDA forums -- there's no mention of UD2 specifically, and I can't find anything about handling unknown opcodes there, either. Like I said, it's no big deal. A) I was just curious. B) I thought it was interesting that it not only didn't interpret the 0F 0B opcode, but it also screwed up interpretation of adjacent code. That seems like kind of strange (and potentially exploitable) behavior. Seeding a few of these in key sections of code would make it a real PITA to disassemble it with IDA. It wouldn't really be a serious deterrent; it's just one more thing in the bag o' tricks you could use to scramble and confuse reversing tools.
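
On the flip side, spotting seeded UD2s before loading a file is not hard. Here is a naive Python sketch (the function name is hypothetical) that lists every offset where the byte pair 0F 0B shows up in a code buffer. It is a plain byte scan, not a disassembler, so it will also flag pairs that happen to sit inside an operand or an immediate. Treat the hits as candidates to check by hand.

```python
UD2 = b"\x0f\x0b"  # the UD2 encoding discussed in this thread


def find_ud2_candidates(code: bytes):
    """Return every offset at which the byte pair 0F 0B appears.

    Naive scan: it does not track instruction boundaries, so some hits
    may be operand bytes rather than real UD2 instructions.
    """
    offsets = []
    pos = code.find(UD2)
    while pos != -1:
        offsets.append(pos)
        pos = code.find(UD2, pos + 1)
    return offsets
```

For example, running it over the first bytes of the test program's code section gives a single hit at offset 0, which is where the patched-in UD2 sits.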

gabri3l

April 20th, 2004, 20:07

The new Intel manual defines UD2 as two different opcodes, 0F0B and 0FB9. Maybe IDA recognizes the other one?