
Opcode Operand Sizes


RITZ
June 27th, 2006, 15:45
I need to grab a chunk of code from the beginning of a function in memory, but it has to be to the nearest complete instruction so that I can arbitrarily concatenate some more code to it. In order to do this I'll obviously need to know the operand sizes of every opcode and trace through the instructions. Does anyone know of some code snippet or API already out there that I can use? Because right now what I'm doing is making a little program just to extract this information from the IA-32 instruction set manual PDFs and put it in a data form that I can use like a C array. But I know I'm not the first person to want to know the operand size for each opcode.

Another thing: I figure I can treat prefixes as just zero-sized opcodes, but that assumes no IA-32 prefix changes the affected instruction's operand size. Am I correct in thinking this? I can't think of any that change the size offhand. If I'm right, then the algo I need this for might just remain elegant.
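
A small sketch of why the prefix assumption deserves a check: in 32-bit code the operand-size prefix 0x66 shrinks the immediate of MOV r, imm (opcodes B8..BF) from four bytes to two, so the prefix cannot simply be counted as a zero-sized opcode in front of a fixed-length instruction. The helper name mov_imm_length is hypothetical, and it covers only this one opcode family in 32-bit mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   Returns the total length of "MOV reg, imm" (opcodes B8..BF) in 32-bit
   mode, including an optional 0x66 operand-size prefix.
   Returns 0 for any other opcode or if the buffer is too short. */
size_t mov_imm_length(const uint8_t *code, size_t avail)
{
    assert(code != NULL);

    size_t pos = 0;
    size_t imm = 4;                 /* default: 32-bit immediate */

    if (pos < avail && code[pos] == 0x66) {
        imm = 2;                    /* prefix switches to a 16-bit immediate */
        pos++;
    }
    if (pos >= avail || code[pos] < 0xB8 || code[pos] > 0xBF)
        return 0;                   /* not covered by this sketch */
    pos++;                          /* the opcode byte itself */

    if (avail - pos < imm)
        return 0;                   /* truncated instruction */
    return pos + imm;
}
```

With this, B8 78 56 34 12 (mov eax, 0x12345678) measures 5 bytes, while 66 B8 34 12 (mov ax, 0x1234) measures 4. Counting the prefix as zero-sized and then adding the usual 5 bytes would overshoot by two. The address-size prefix 0x67 has a similar effect on displacement sizes.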

ZaiRoN
June 27th, 2006, 16:42
To know the size of each instruction you have to write a sort of disasm engine. If you don't want to write it from scratch you can use Ollydbg, calling the Disasm function with the DISASM_SIZE parameter.

Look at http://www.ollydbg.de/srcdescr.htm, you'll surely find what you need.

RITZ
June 27th, 2006, 18:56
Yeah, but I only need to do very very basic disassembly. I'm just looking for a simple C/C++ snippet. Or does anyone at least know a website or any document that I could cut the instruction operand sizes out of?

A normal disassembler would actually do intelligent interpretation, and alter the meaning of the operands with prefixes, but I only need to know what is what so that I don't let operands get interpreted as opcodes.

LLXX
June 27th, 2006, 20:40
What you need is basically a partial disassembler, i.e. it can only identify instruction boundaries and nothing else.

The best method to use would be a lookup table for each of the 256 byte values that can be the first byte of an instruction, containing their length. Certain instructions such as int xx and int3 are fixed sizes; many others will have ModRM/SIB bytes that you will need to take into consideration.
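
The table idea can be sketched roughly as follows. This is a minimal illustration, not a full length disassembler: it assumes 32-bit code, handles no prefixes and no two-byte (0F xx) opcodes, and fills in only a handful of one-byte opcodes. All names (op_flags, insn_length, whole_insn_span) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { F_KNOWN = 1, F_MODRM = 2, F_IMM8 = 4, F_IMM32 = 8 };

static uint8_t op_flags[256];   /* 0 = opcode not covered by this sketch */

static void set_range(uint8_t lo, uint8_t hi, uint8_t flags)
{
    for (unsigned op = lo; op <= hi; op++)
        op_flags[op] = flags;
}

static void init_table(void)
{
    set_range(0x50, 0x5F, F_KNOWN);                 /* PUSH/POP r32      */
    set_range(0xB8, 0xBF, F_KNOWN | F_IMM32);       /* MOV r32, imm32    */
    op_flags[0x68] = F_KNOWN | F_IMM32;             /* PUSH imm32        */
    op_flags[0x6A] = F_KNOWN | F_IMM8;              /* PUSH imm8         */
    op_flags[0x81] = F_KNOWN | F_MODRM | F_IMM32;   /* grp1 r/m32, imm32 */
    op_flags[0x83] = F_KNOWN | F_MODRM | F_IMM8;    /* grp1 r/m32, imm8  */
    op_flags[0x89] = F_KNOWN | F_MODRM;             /* MOV r/m32, r32    */
    op_flags[0x8B] = F_KNOWN | F_MODRM;             /* MOV r32, r/m32    */
    op_flags[0x90] = F_KNOWN;                       /* NOP               */
    op_flags[0xC3] = F_KNOWN;                       /* RET               */
    op_flags[0xCC] = F_KNOWN;                       /* INT3              */
    op_flags[0xCD] = F_KNOWN | F_IMM8;              /* INT imm8          */
    op_flags[0xE8] = F_KNOWN | F_IMM32;             /* CALL rel32        */
    op_flags[0xE9] = F_KNOWN | F_IMM32;             /* JMP rel32         */
    op_flags[0xEB] = F_KNOWN | F_IMM8;              /* JMP rel8          */
}

/* Length of the instruction at code, or 0 if unknown or truncated. */
size_t insn_length(const uint8_t *code, size_t avail)
{
    assert(code != NULL);
    if (!op_flags[0x90])
        init_table();                       /* lazy one-time setup */
    if (avail == 0)
        return 0;

    uint8_t flags = op_flags[code[0]];
    if (!(flags & F_KNOWN))
        return 0;

    size_t len = 1;
    if (flags & F_MODRM) {
        if (len >= avail)
            return 0;
        uint8_t modrm = code[len++];
        uint8_t mod = modrm >> 6, rm = modrm & 7;
        if (mod != 3 && rm == 4) {          /* SIB byte follows */
            if (len >= avail)
                return 0;
            uint8_t sib = code[len++];
            if (mod == 0 && (sib & 7) == 5)
                len += 4;                   /* disp32 with no base register */
        } else if (mod == 0 && rm == 5) {
            len += 4;                       /* plain disp32 */
        }
        if (mod == 1) len += 1;             /* disp8  */
        if (mod == 2) len += 4;             /* disp32 */
    }
    if (flags & F_IMM8)  len += 1;
    if (flags & F_IMM32) len += 4;
    return len <= avail ? len : 0;
}

/* Smallest run of whole instructions covering at least min_bytes,
   or 0 if an unknown or truncated instruction is hit first. */
size_t whole_insn_span(const uint8_t *code, size_t avail, size_t min_bytes)
{
    size_t total = 0;
    while (total < min_bytes) {
        size_t n = insn_length(code + total, avail - total);
        if (n == 0)
            return 0;
        total += n;
    }
    return total;
}
```

For a typical prologue such as 55 / 8B EC / 83 EC 10 / 8B 45 08, asking whole_insn_span for at least 5 bytes returns 6, i.e. the first three instructions. Storing flags rather than raw lengths in the table keeps the ModRM/SIB handling in one place; a real engine would need the complete opcode map and prefix handling on top of this.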

deroko
June 27th, 2006, 20:48
you may find some nice length disassembler engines in 29a zine with sources, especially the ones coded by z0mbie (lde32, ade32 and xde)

RITZ
June 27th, 2006, 21:28
Quote:
[Originally Posted by LLXX]The best method to use would be a lookup table for each of the 256 byte values that can be the first byte of an instruction, containing their length. Certain instructions such as int xx and int3 are fixed sizes; many others will have ModRM/SIB bytes that you will need to take into consideration.

Uh-huh. The problem is not formulating such a simple algo. Of course I'll use a look-up table. But do you know of a web page or text document that I could easily parse to make that look-up table?

@deroko: So it looks like that was exactly what I needed. I hadn't known the term "length disassembler engine", so I couldn't search effectively. LDE32 should be perfect. Thanks!

RITZ
June 28th, 2006, 01:19
LDE32 just flat out gave the wrong results every time. But then I tried XDE, and that works fantastically. Thanks again!

Kayaker
July 3rd, 2006, 20:46
Hi

Just as an additional example, whether useful or not, I found this the other night by chance:

Sehuk Length Disassembler Engine
"VC++ Inline Asm implementation"
"mainly oriented toward Detour hooking or just hooking."

http://www.reversemode.com/?option=com_content&task=view&id=8&Itemid=2

There is also an example of using the LDE in a hooking package called 'Congrio'.

There are a few other articles there as well that might be of general interest, check in the 'downloads' section:

'Generic detection and classification of Polymorphic malware using Neural Pattern Recognition'

'Teenager Source Code' - several Linux files (0xf001 stuph)

etc.

Kayaker

RITZ
July 3rd, 2006, 22:52
I found an elegant solution that didn't involve any disassembling.

LLXX
July 4th, 2006, 03:33
Quote:
[Originally Posted by RITZ]I found an elegant solution that didn't involve any disassembling.
I'll guess... you enabled the single-step flag, thus forcing every breakpoint to be on an instruction boundary.

RITZ
July 4th, 2006, 20:29
Actually, I avoided needing to know the length altogether.

http://woodmann.net/forum/showpost.php?p=59440&postcount=17