View Full Version : YAPEQ - "Yet Another PE Question"


Silver
September 23rd, 2004, 08:34
Okay, so I've come across another conceptual problem I have with the PE structure. Not sure why I didn't notice this before, but anyway.

I was unpacking a target packed with a slightly mangled UPX 1.25 (-d wouldn't unpack it), but ran into some problems. So I grabbed upx.exe and upx'd notepad to test the process (I've successfully unpacked upx manually before, so I figure I've just forgotten something). Long story short, I dumped and fixed it without problems, and used Imprec to rebuild the IAT.

When I looked at the PE directory info, I noticed something I don't understand (from procdump PE editor, directory structures editor):

Original notepad:
Import table RVA: 6D20 Size: C8
IAT RVA: 1000 Size: 324

Dumped & Imprec fixed notepad:
Import table RVA: 1D000 Size: B4
IAT RVA: 0 Size: 0

I understand the Import table RVA/size is different due to a new section by imprec, that's fine. What I don't understand is how & why the dumped/fixed notepad still works when the RVA for the IAT isn't set?
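For reference, a minimal sketch of how those two directory entries can be read out of the headers. It assumes the standard PE32/PE32+ header layout; `read_data_directory` and `make_header` are hypothetical helper names, and `make_header` only fabricates enough header bytes to reproduce the values listed above (it is not a loadable file).

```python
import struct

IMAGE_DIRECTORY_ENTRY_IMPORT = 1
IMAGE_DIRECTORY_ENTRY_IAT = 12

def read_data_directory(image: bytes, index: int):
    """Return (rva, size) for one data directory entry of a PE32/PE32+ header."""
    e_lfanew = struct.unpack_from("<I", image, 0x3C)[0]
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    opt = e_lfanew + 4 + 20            # skip the signature and the 20-byte COFF header
    magic = struct.unpack_from("<H", image, opt)[0]
    if magic == 0x10B:                 # PE32
        dirs = opt + 96
    elif magic == 0x20B:               # PE32+
        dirs = opt + 112
    else:
        raise ValueError("unknown optional header magic")
    count = struct.unpack_from("<I", image, dirs - 4)[0]   # NumberOfRvaAndSizes
    if index >= count:
        return (0, 0)
    return struct.unpack_from("<II", image, dirs + 8 * index)

def make_header(import_dir, iat_dir):
    """Hypothetical helper: fake PE32 headers carrying only the two entries."""
    buf = bytearray(0x40 + 24 + 96 + 16 * 8)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)        # e_lfanew
    buf[0x40:0x44] = b"PE\0\0"
    opt = 0x40 + 24
    struct.pack_into("<H", buf, opt, 0x10B)        # PE32 magic
    struct.pack_into("<I", buf, opt + 92, 16)      # NumberOfRvaAndSizes
    struct.pack_into("<II", buf, opt + 96 + 8 * IMAGE_DIRECTORY_ENTRY_IMPORT, *import_dir)
    struct.pack_into("<II", buf, opt + 96 + 8 * IMAGE_DIRECTORY_ENTRY_IAT, *iat_dir)
    return bytes(buf)

original = make_header((0x6D20, 0xC8), (0x1000, 0x324))
dumped = make_header((0x1D000, 0xB4), (0, 0))

for label, image in (("original", original), ("dumped", dumped)):
    imp = read_data_directory(image, IMAGE_DIRECTORY_ENTRY_IMPORT)
    iat = read_data_directory(image, IMAGE_DIRECTORY_ENTRY_IAT)
    print(label, "import:", [hex(v) for v in imp], "IAT:", [hex(v) for v in iat])
```

On a real file the same function would be fed the raw file contents instead of the fabricated header.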

I've read a number of PE tutorials & docs, and they don't explain this very clearly (ie: the difference/use/function of the import table entry and the IAT entry in the PE structure). Quote from http://msdn.microsoft.com/msdnmag/issues/02/03/PE2/default.aspx

Quote:
The anchor of the imports data is the IMAGE_IMPORT_DESCRIPTOR structure. The DataDirectory entry for imports points to an array of these structures. There's one IMAGE_IMPORT_DESCRIPTOR for each imported executable. The end of the IMAGE_IMPORT_DESCRIPTOR array is indicated by an entry with fields all set to 0. Figure 5 shows the contents of an IMAGE_IMPORT_DESCRIPTOR. Each IMAGE_IMPORT_DESCRIPTOR typically points to two essentially identical arrays. These arrays have been called by several names, but the two most common names are the Import Address Table (IAT) and the Import Name Table (INT). Figure 6 shows an executable importing some APIs from USER32.DLL.


Okay, so that says to me that the import table directory entry in the PE header points to a location within the exe that contains 2 arrays - the IAT and the INT. Fine. But if I read part 1 of that essay, I see the following:

Quote:
IMAGE_DIRECTORY_ENTRY_IAT - Points to the beginning of the first Import Address Table (IAT). The IATs for each imported DLL appear sequentially in memory. The Size field indicates the total size of all the IATs. The loader uses this address and size to temporarily mark the IATs as read-write during import resolution.


Okay, so that says that the IAT directory entry points to the start of the first IAT (which up until now I thought was exactly how this worked).

The original notepad PE has values for both import table and IAT in the directory. The dumped version has no value set for the IAT in the directory, yet it works.

How can the dumped Notepad work without an RVA for the IAT? What am I missing here?

doug
September 23rd, 2004, 08:54
This might not be the kind of complete answer you are expecting, but...

I can't think of a situation where these fields are required.

This just points to the start of the first IAT. What about the others? Where does the loader get them? From the IMAGE_IMPORT_DESCRIPTOR. So if it can get the other IATs from there, there's no reason to have a special case for the "first IAT"; it just grabs that one from the IID as well.

The loader just does everything from the IMAGE_IMPORT_DESCRIPTOR array.

Even though most compiled PE files lay out all their IATs (i.e. one per imported DLL) sequentially, you can build a perfectly valid PE with IATs scattered all over the file, as long as each imported DLL's IAT is one contiguous block (terminated by a 4-byte NULL) AND your IID array is correctly built.

I've never bothered with these two PE header fields (rva&size). The way I see it, these fields are redundant. I don't think there is a rationale behind this. Performance? Legacy feature?
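A rough sketch of that idea, not the actual Windows loader code: walking the IID array alone is enough to locate every DLL's IAT, even when the IATs are not adjacent. It assumes a 32-bit image whose buffer is indexed directly by RVA (as if already mapped); `walk_import_descriptors`, `put`, and the tiny hand-built image are made up for illustration.

```python
import struct

# IMAGE_IMPORT_DESCRIPTOR: OriginalFirstThunk, TimeDateStamp,
# ForwarderChain, Name, FirstThunk (five DWORDs, 20 bytes)
IID = struct.Struct("<5I")

def walk_import_descriptors(mapped, import_rva):
    """Yield (dll_name, iat_rva, slot_count) for each descriptor in the array."""
    off = import_rva
    while True:
        fields = IID.unpack_from(mapped, off)
        if not any(fields):                    # all-zero entry ends the array
            break
        name_rva, first_thunk = fields[3], fields[4]
        end = mapped.index(b"\0", name_rva)
        name = mapped[name_rva:end].decode("ascii")
        slots = 0                              # count 4-byte slots up to the NULL
        while struct.unpack_from("<I", mapped, first_thunk + 4 * slots)[0] != 0:
            slots += 1
        yield name, first_thunk, slots
        off += IID.size

def put(buf, rva, data):
    """Hypothetical helper: copy raw bytes into the fake image at an RVA."""
    buf[rva:rva + len(data)] = data

# Hand-built image: two descriptors whose IATs sit at unrelated addresses.
img = bytearray(0x400)
IID.pack_into(img, 0x100, 0, 0, 0, 0x200, 0x300)   # USER32.DLL, IAT at 0x300
IID.pack_into(img, 0x114, 0, 0, 0, 0x210, 0x080)   # KERNEL32.DLL, IAT at 0x080
put(img, 0x200, b"USER32.DLL")
put(img, 0x210, b"KERNEL32.DLL")
struct.pack_into("<2I", img, 0x300, 0x250, 0x260)  # two thunks, then NULL
struct.pack_into("<I", img, 0x080, 0x270)          # one thunk, then NULL

found = list(walk_import_descriptors(bytes(img), 0x100))
for name, iat_rva, slots in found:
    print(name, hex(iat_rva), slots)
```

Nothing in the walk consults the IAT directory entry; every IAT address comes from a descriptor's FirstThunk field.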

Silver
September 24th, 2004, 05:45
Quote:
I've never bothered with these two PE header fields (rva&size)


For the IAT directory entry, you mean? So in other words, the IAT directory entry is irrelevant if the import table entry points to a valid IID array.

Interesting. Anyone else have any further thoughts on this?

doug
September 24th, 2004, 13:23
Yes, when I have to, I rebuild a proper IID and set the new rva&size in the PE header.

As for the IAT directory entry, I usually reset rva&size to 0.
--
Quote:

So in other words, the IAT directory entry is irrelevant if the import table entry points to a valid IID array.

Yes, but since the IID has to be valid if you want the PE to be loaded at all, the IAT directory seems to always be irrelevant.

I did not look at the w2k source code, but I think the code for the PE loader is in there. Perhaps a better answer can be found there. Try grepping the loader source code for the IAT directory entry's symbolic constant (IMAGE_DIRECTORY_ENTRY_IAT).

Silver
September 25th, 2004, 09:52
Thanks for the info, doug.

homersux
September 27th, 2004, 18:05
Matt Pietrek has described the PE structure extensively in his MSJ columns. The IAT is completely irrelevant on disk. It only matters when the loader loads the PE file into memory and constructs the executable image.

nikolatesla20
September 27th, 2004, 20:54
Exactly, IAT RVA is not used by the loader. Most likely it was to be used by debuggers, etc.

-nt20

user
September 27th, 2004, 21:21
This is not right. The loader does use that information if a bound import directory exists. I'll let you find the explanation as an exercise; I can't be arsed to explain, as you all seem to agree on the question. Here is a hint: it's linked to speed improvement.

homersux
September 28th, 2004, 09:04
Hi user, what you said is correct, but binding is a somewhat ancient technique now, and most people will tell you it's not recommended to sacrifice compatibility for the small speed gain. Binding was mostly used in the old days, when load time was critical on slow computers or during Windows boot-up. It's not surprising that binding is mostly used by Windows itself during its loading process. I'd say 99.94% of PEs are not bound, therefore our statement is practically correct.

Silver
September 28th, 2004, 12:47
homersux, the article I linked to in my first post is the Matt Pietrek article, so unless you mean a different one then I stand by my original assertion that it's not very clear on the subject. IAT is of course irrelevant on disk, but I believed that the directory entry for it was not.

Thanks for the replies.

user
September 28th, 2004, 18:48
Sacrifice compatibility? Hahaha. Either you have a poor understanding of how it works or your words went a bit further than your thoughts (I'll go for the second one).

First of all it's not ancient: check your calc.exe and notepad.exe and you will notice it's still in use. You sacrifice nothing by doing so; you add extra information that helps the import resolver speed up if the DLL versions match. I understand why you'd say that, but it's inaccurate; it's enough for a newbie to know, for sure. But frankly, even if that were the case, practically true is still not true in my book.

In the field, for a normal application that has to run on multiple versions, it is for sure a waste of time (even though it is no problem for 100% correct execution), BUT if you have an application targeted at a specific SP or build, it's worth the effort.
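To check whether a given descriptor is bound at all, a small sketch based on the TimeDateStamp convention described in the winnt.h comments for IMAGE_IMPORT_DESCRIPTOR. `binding_state` is a hypothetical helper name, and the sample stamp in the last line is an arbitrary value chosen for illustration.

```python
def binding_state(time_date_stamp: int) -> str:
    """Classify an IMAGE_IMPORT_DESCRIPTOR.TimeDateStamp value."""
    if time_date_stamp == 0:
        return "not bound"
    if time_date_stamp == 0xFFFFFFFF:
        # -1: the real stamps live in the bound import directory (new-style bind)
        return "bound (new style)"
    # anything else is the timestamp of the DLL that was bound to (old-style bind)
    return "bound (old style)"

print(binding_state(0))
print(binding_state(0xFFFFFFFF))
print(binding_state(0x3B7D8410))   # arbitrary example stamp
```

When the stamps match the DLL actually loaded, the addresses already stored in the IAT can be used as they are; when they don't, the imports are resolved the normal way.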

homersux
October 3rd, 2004, 13:47
user, your example is an annotation of my statement: binding is mostly only used by MS programs to speed up loading time. Most PEs not produced by MS don't use this technique.

Silver, maybe I missed your link, but it'd be the link I'd recommend to anyone who has a PE question.

user
October 3rd, 2004, 20:24
Homersux: I was merely correcting your statements regarding the IAT usage, and the compatibility argument you and others advanced. It's not an annotation, it's a correction/clarification. Spreading incorrect or incomplete information is the source of a lot of errors. I personally don't care about the matter itself; feel free to consider it ancient and a source of compatibility problems.

Silver: Anyway, the loader doesn't use the IAT area; it will overwrite it (if no bound import exists). Check a Borland-generated EXE, and you will see there is no IAT directory entry. You will notice too that IMAGE_IMPORT_DESCRIPTOR.OriginalFirstThunk is NULL and only FirstThunk is used. FirstThunk acts there as both the name/ordinal table and the IAT. The minimum is really having the FirstThunk field filled in.
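A small sketch of that fallback, assuming a 32-bit image indexed directly by RVA and read before the loader has overwritten the FirstThunk slots. `imported_symbols` and the hand-built image are hypothetical; only the 0x80000000 ordinal flag and the 2-byte hint in front of each name come from the PE format itself.

```python
import struct

IMAGE_ORDINAL_FLAG32 = 0x80000000

def imported_symbols(mapped, original_first_thunk, first_thunk):
    """List the names/ordinals imported through one descriptor."""
    # Fall back to FirstThunk when OriginalFirstThunk is NULL (Borland-style).
    lookup = original_first_thunk or first_thunk
    symbols = []
    while True:
        (thunk,) = struct.unpack_from("<I", mapped, lookup)
        if thunk == 0:                           # NULL thunk ends the table
            break
        if thunk & IMAGE_ORDINAL_FLAG32:
            symbols.append(thunk & 0xFFFF)       # import by ordinal
        else:
            start = thunk + 2                    # skip the 2-byte hint
            end = mapped.index(b"\0", start)
            symbols.append(mapped[start:end].decode("ascii"))
        lookup += 4
    return symbols

# Hand-built image: OriginalFirstThunk = 0, FirstThunk table at 0x40.
img = bytearray(0x100)
struct.pack_into("<2I", img, 0x40, 0x80, IMAGE_ORDINAL_FLAG32 | 16)
img[0x82:0x82 + 11] = b"MessageBoxA"             # IMAGE_IMPORT_BY_NAME at 0x80

symbols = imported_symbols(bytes(img), 0, 0x40)
print(symbols)
```

With a non-NULL OriginalFirstThunk the same loop reads the separate name table instead, so the FirstThunk slots can be overwritten without losing the names.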