Calc.exe Disassembly [Archive] - RCE Messageboard's Regroupment

414B

May 1st, 2007, 19:27

Hey,
I am writing a program that reads a PE file, finds the start of the .text section (PointerToRawData), and then attempts to disassemble it from there. As a test program I am using the Windows calculator. For some reason, my program shows me that the .text section starts with
"EA 22 DD 77 D7 23 DD 77 9A 18 DD 77 00 00 00 00 2E 1E C7 77 83 1D C7 77 FF 1E C7 77 00 00 00 00" (just a snippet of the first 32 bytes)
However, OllyDbg shows me:
"1B 76 DD 77 83 78 DD 77 F0 6B DD 77 00 00 00 00 39 5E F1 77 87 5D F1 77 EB 5E F1 77 00 00 00 00" which according to OllyDbg translates to:
1B76DD77 DD ADVAPI32.RegOpenKeyExA
8378DD77 DD ADVAPI32.RegQueryValueExA
F06BDD77 DD ADVAPI32.RegCloseKey
00000000 DD 00000000
395EF177 DD GDI32.SetBkColor
875DF177 DD GDI32.SetTextColor
EB5EF177 DD GDI32.SetBkMode
00000000 DD 00000000

I know for sure that my program is reading the correct bytes (the start offset comes to 0x400), since I have cross-checked the file with a hex editor.
Approximately 1230 bytes past the beginning of the .text section, Olly and my program agree and show the same bytes.

My questions are:
1. Is OllyDbg doing some translation at the beginning of the .text section?
2. If I want to disassemble the .text section (I have disassembler code), where should I start? Should I start from the "ProgramEntryPoint" instead of the start of .text, or from some place else?
3. If I had a polymorphic program, the beginning of .text would be treated on a byte-by-byte basis (i.e. DB, being encrypted). How should I differentiate between what is code and what is data?

Thanks

ancev

May 1st, 2007, 21:03

414B,

When Olly loads the file, the import table gets parsed... so on disk those entries point to API names and the like, and in memory (after the file is loaded by Olly) they point to memory addresses.

vecna

414B

May 1st, 2007, 21:33

vecna,
Thanks for the reply. So, if I am attempting to disassemble the program, how would I know whether a byte points to an API call or is an instruction?
414B

Ricardo Narvaja

May 1st, 2007, 22:37

In OllyDbg, you can right-click in a part of the code and choose View -> Executable file. You will then see the code of the executable as it is on the hard disk, unmodified.

ricnar

blabberer

May 3rd, 2007, 12:22

Yes, you have to start from peHeader->AddressOfEntryPoint. That's the first executable line for a normal PE file (leave aside TLS, DllMain and so on for now).

calc.exe is produced with the /merge linker switch, so the import table is what you will get at 0x400 (section header .text -> PointerToRawData).

For starters, try disassembling a pure asm example (possibly Iczelion tut 03 win.exe)
and then compare it with OllyDbg.
When it comes out right, switch to a complex application.
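
A minimal Python sketch of the lookup described above (read AddressOfEntryPoint, then map that RVA to a file offset through the section table) might look like the following. The function name entry_point_file_offset is made up for illustration, the field offsets assume a well-formed PE image, and TLS callbacks and the like are ignored.

```python
import struct

def entry_point_file_offset(data: bytes) -> int:
    """Return the file offset that AddressOfEntryPoint maps to.

    Illustrative helper only; it does no validation beyond the two signatures.
    """
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset of the PE signature) lives at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    coff = e_lfanew + 4                       # 20-byte COFF file header
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    opt = coff + 20                           # optional header
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    sec = opt + opt_size                      # first 40-byte section header
    for i in range(num_sections):
        off = sec + 40 * i
        # VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData
        _vsize, vaddr, raw_size, raw_ptr = struct.unpack_from("<IIII", data, off + 8)
        if vaddr <= entry_rva < vaddr + raw_size:
            return entry_rva - vaddr + raw_ptr
    raise ValueError("entry point RVA is not backed by any section on disk")
```

With a real file you would pass in open(path, "rb").read() and begin disassembling at the returned offset instead of at the section's PointerToRawData.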

cyphunk

May 4th, 2007, 05:23

I found this tutorial to be pretty direct and to the point, and I think it will help you:
http://www.bytedevils.de/tutorials/18102004.html

It is in German. I have an English translation along with my own example application on an old web server somewhere. If you want it, let me know. Actually, I wouldn't mind if you did; it would force me to dig out those resources.

dELTA

May 7th, 2007, 07:22

To sum it up, if you're trying to create a disassembler, you really must have good knowledge/control of the PE specs first.

TBone

May 14th, 2007, 16:42

Quote:

[Originally Posted by 414B;65332]For some reason, my program shows me that the .text section starts with
"EA 22 DD 77 D7 23 DD 77 9A 18 DD 77 00 00 00 00 2E 1E C7 77 83 1D C7 77 FF 1E C7 77 00 00 00 00" (just a snippet of the first 32 bytes)
However OllyDbg shows me :
"1B 76 DD 77 83 78 DD 77 F0 6B DD 77 00 00 00 00 39 5E F1 77 87 5D F1 77 EB 5E F1 77 00 00 00 00" which according to OllyDbg translates to:
1B76DD77 DD ADVAPI32.RegOpenKeyExA
...

This would appear to be a chunk of the Import Address Table (IAT). The IAT is built at run-time by the PE loader using the data in the executable's Import Directory Table (IDT). Any values that might be in the IAT on disk are almost certainly going to be replaced by different values at run-time. Your disassembler appears to be finding the start of .text correctly; the section just doesn't contain the same values in memory as it does on disk.

Quote:

[Originally Posted by 414B;65332]My question thus is:
1. Is OllyDbg doing some translation in the beginning of the .text section.

Not exactly. Olly's comments in data sections are just best guesses. It keeps track of the entry point for API functions like RegOpenKeyExA as they are loaded. It knew that (this time) RegOpenKeyExA was at address 0x77DD761B. So when Olly sees this exact number somewhere in a data section, it guesses that this is probably a reference to RegOpenKeyExA. In this case, it's a good guess. But it's also possible (if unlikely) that this exact number could have just shown up purely by chance.
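
To make that guessing concrete, here is a rough Python sketch: walk a block of bytes one DWORD at a time and label any value that matches a known API entry point. The addresses in KNOWN_EXPORTS are copied from the OllyDbg listing in the first post and only hold for that one run; the function name annotate_dwords is made up, and this is not a claim about how Olly is actually implemented.

```python
import struct

# Entry points as reported in the OllyDbg listing quoted in this thread.
# They are valid only for that particular run on that particular machine.
KNOWN_EXPORTS = {
    0x77DD761B: "ADVAPI32.RegOpenKeyExA",
    0x77DD7883: "ADVAPI32.RegQueryValueExA",
    0x77DD6BF0: "ADVAPI32.RegCloseKey",
    0x77F15E39: "GDI32.SetBkColor",
    0x77F15D87: "GDI32.SetTextColor",
    0x77F15EEB: "GDI32.SetBkMode",
}

def annotate_dwords(data: bytes, exports: dict) -> list:
    """Return (value, name_or_None) for each little-endian DWORD in data."""
    usable = data[:len(data) - len(data) % 4]   # drop a trailing partial DWORD
    return [(value, exports.get(value))
            for (value,) in struct.iter_unpack("<I", usable)]

# The bytes OllyDbg showed in memory (first post).
in_memory = bytes.fromhex(
    "1B76DD77 8378DD77 F06BDD77 00000000 395EF177 875DF177 EB5EF177 00000000"
)
for value, name in annotate_dwords(in_memory, KNOWN_EXPORTS):
    print(f"{value:08X}  {name or ''}")
```

Running the same function over the on-disk bytes from the first post (EA 22 DD 77 ...) finds no matches, because those values are not the addresses the loader resolved this time.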

Quote:

[Originally Posted by 414B;65332]2. If I want to disassemble (I have disassembler code) the .text section, where should I start from. Should I start from the "ProgramEntryPoint" instead of the start of .text or from some place else.

For the simplest of all disassemblers, you should start from the program's entry point and start tracing from there. This is enough to follow simple jumps and calls, but your disassembler can't follow a command like "JMP EAX" unless it knows what EAX would contain at run-time. A slightly more sophisticated disassembler will see if the value of EAX was set in the code just before the jump. Even more sophisticated disassemblers come up with other tricks to guess what EAX might contain in order to follow the flow of the program. They also scan the program for sequences of bytes that are likely to represent code, even if it doesn't look like the code is ever reached.
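
As a toy illustration of tracing from the entry point, the Python sketch below follows control flow through a handful of x86 opcodes only (NOP, RET, CALL rel32, JMP rel32, JMP SHORT rel8, and JMP EAX). The function name trace and the tiny opcode subset are assumptions made for the example; a real disassembler needs a full instruction decoder.

```python
import struct

def trace(code: bytes, base: int, entry: int) -> dict:
    """Return {address: text} for every instruction reachable from entry.

    code is the raw bytes, base is the address of code[0].
    """
    seen = {}
    work = [entry]                      # addresses still to be explored
    while work:
        addr = work.pop()
        while addr not in seen and 0 <= addr - base < len(code):
            off = addr - base
            op = code[off]
            if op == 0x90:
                seen[addr] = "NOP"
                addr += 1
            elif op == 0xC3:
                seen[addr] = "RET"
                break
            elif op in (0xE8, 0xE9) and off + 5 <= len(code):
                (rel,) = struct.unpack_from("<i", code, off + 1)
                target = addr + 5 + rel
                work.append(target)     # explore the destination later
                if op == 0xE8:
                    seen[addr] = f"CALL {target:#x}"
                    addr += 5           # assume the call returns
                else:
                    seen[addr] = f"JMP {target:#x}"
                    break
            elif op == 0xEB and off + 2 <= len(code):
                (rel,) = struct.unpack_from("<b", code, off + 1)
                target = addr + 2 + rel
                seen[addr] = f"JMP SHORT {target:#x}"
                work.append(target)
                break
            elif code[off:off + 2] == b"\xFF\xE0":
                seen[addr] = "JMP EAX"  # target unknown statically: stop here
                break
            else:
                seen[addr] = "??"       # outside the toy opcode subset
                break
    return seen

# CALL +3 ; JMP EAX ; NOP (never reached) ; NOP ; RET
sample = bytes.fromhex("E803000000 FFE0 90 90 C3")
for address, text in sorted(trace(sample, 0x401000, 0x401000).items()):
    print(f"{address:#x} | {text}")
```

The byte after JMP EAX is never listed, since nothing the tracer can see leads to it. That is the gap the scanning heuristics mentioned above try to fill.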

Quote:

3. If I had a polymorphic program, then the beginning of .text would be treated on a byte by byte basis (i.e. DB - being encrypted), thus how should I differentiate between what is code and what is data ?

The short version is that you can't. That's basically the point of packing/encrypting a program. You'll need to decrypt/unpack the program and *then* analyze it. Unpacking and decrypting are two of the biggest topics you'll find discussed around here.

Quote:

[Originally Posted by 414B;65332]Thanks for the reply. So, if I am attempting to disassemble the program, how would I know that a byte points to an API call or if it is an instruction.

I sort of answered this in 1) above. To get more specific, the particular references you gave in your example weren't actually API calls. They were a table of pointers to the entry points of API functions. The actual API calls might reference this table. API calls are really just ordinary instructions. If RegOpenKeyExA is at address 0x77DD761B, the call could take several forms. The simplest is a direct call:
0x40100100 | CALL 0x77DD761B

A slightly more complicated way might be a call to a jump:
0x40100100 | CALL 0x40123456
...
0x40123456 | JMP 0x77DD761B

An even more complicated way would be to use a "thunk" table (like the IAT):
0x40100100 | CALL DWORD PTR DS:[40200000]
...
0x40200000 | DD 0x77DD761B (RegOpenKeyExA)
0x40200004 | DD 0x77DD7820 (some other API function)
0x40200008 | etc.
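
For the thunk-table form, a static disassembler can name the call by looking at which table slot the instruction reads, without knowing the run-time address stored in it. A rough Python sketch, assuming the "FF 15 imm32" encoding of CALL DWORD PTR [imm32] and a slot-to-name map that a real tool would build from the import data (here it is filled in by hand, and resolve_indirect_call is a made-up name):

```python
import struct

def resolve_indirect_call(insn: bytes, iat_slots: dict):
    """If insn starts with CALL DWORD PTR [imm32], return (slot, name_or_None).

    Returns None for any other instruction.
    """
    if len(insn) >= 6 and insn[:2] == b"\xFF\x15":
        (slot,) = struct.unpack_from("<I", insn, 2)
        return slot, iat_slots.get(slot)
    return None

# Hand-written stand-in for a map built from the import table;
# the slot address matches the example listing above.
iat_slots = {0x40200000: "ADVAPI32.RegOpenKeyExA"}

call = bytes.fromhex("FF15 00002040")   # CALL DWORD PTR DS:[40200000]
print(resolve_indirect_call(call, iat_slots))
```

The direct CALL and call-to-a-JMP forms would need the run-time address instead, which is why they are harder to label from the file alone.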