View Full Version : Code to find IAT


svetlana57
June 9th, 2006, 11:20
I'm trying to code a proggie like ImpRec. So I need a clue about the way ImpRec finds address and size of IAT. Source on Asm or C would be appreciated. Thanks.

kittmaster
June 9th, 2006, 19:32
My understanding of it <although I could be off> is that it uses the code data pointers to memory map and then scans for length by return count error. As the user puts in the OEP entry point it maps the physical to memory space using the system DTRs.

That is my understanding of it, but again I could be wrong. I'm sure others may have a more definitive answer and correct any errors in my statements.

Why do you want to reinvent the wheel? Just curious.

Admiral
June 9th, 2006, 20:18
Hi svetlana57

I've been down this road (writing ArmInline) and have had much success (though not total success, in the case of a protected IAT).

If you can be sure that your target's IAT is intact then the algorithm is straightforward and you should, with a little brainstorming, be able to work it out yourself. If not, your work is cut out for you. Either way, I'll outline the process:

Let's suppose that the target IAT is as Windows left it, and I'll assume that you are comfortable with the Win32 debug API (in particular, ReadProcessMemory).
Now ImpRec requires OEP as input. Given this, things are easy. But even if you don't have the OEP you can still proceed effectively: All that needs to be done is to isolate the .code/.text section (parsing the PE header is a good idea) to find bounds on memory you can count on containing IAT references. If you know the OEP then this is a great place to start.

From here, the idea is to linear search (binary-compare) for the opcode for CALL DWORD PTR (which is 0xFF15) and take note of the DWORD that follows (it will be a relative pointer). I'd recommend searching for a few of these until you get a decent sample size and a very low variance (suggesting that your samples are mostly pointing at the IAT).
Now simply take the mean or any sample entry nearby (you can be surer using a least-squares or similar modal-range finding algorithm) so you can say with some confidence that you have an address that lies within the IAT.
All that's left to do is to extrapolate forwards and backwards from this point, following the rules for an IAT to find the start and end (it is well-known that the IAT ends with two nulls).
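A minimal Python sketch of the steps outlined above, assuming the code section and a window of the target's data have already been copied into byte buffers (for instance with ReadProcessMemory). The function names, the bucket size, and the use of a modal bucket in place of a least-squares fit are illustrative assumptions, not ImpRec's actual method. The dword after FF 15 is treated as an absolute virtual address, as pointed out further down the thread.

```python
import struct
from collections import Counter

def find_call_operands(code: bytes) -> list[int]:
    """Collect the dword that follows every FF 15 byte pair (byte-wise search)."""
    operands = []
    pos = code.find(b"\xff\x15")
    while pos != -1 and pos + 6 <= len(code):
        operands.append(struct.unpack_from("<I", code, pos + 2)[0])
        pos = code.find(b"\xff\x15", pos + 1)
    return operands

def pick_iat_point(operands: list[int], bucket: int = 0x1000) -> int:
    """Return the middle sample of the most populated address bucket, 4-aligned.

    Assumes at least one operand was found.
    """
    counts = Counter(op // bucket for op in operands)
    best, _ = counts.most_common(1)[0]
    members = sorted(op for op in operands if op // bucket == best)
    return members[len(members) // 2] & ~3

def extend_bounds(mem: bytes, mem_base: int, point: int) -> tuple[int, int]:
    """Walk outwards from point until two consecutive null dwords or the buffer edge.

    Returns the half-open address range [start, end).
    """
    dwords = struct.unpack_from(f"<{len(mem) // 4}I", mem)
    n = len(dwords)
    lo = hi = (point - mem_base) // 4
    while lo > 0:
        if dwords[lo - 1] == 0 and (lo == 1 or dwords[lo - 2] == 0):
            break
        lo -= 1
    while hi + 1 < n:
        if dwords[hi + 1] == 0 and (hi + 2 == n or dwords[hi + 2] == 0):
            break
        hi += 1
    return mem_base + 4 * lo, mem_base + 4 * (hi + 1)
```

A single null dword is stepped over here on the assumption that it only separates two DLLs' thunks; only a null pair ends the walk. That rule breaks down on a protected IAT.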

I don't have any source code to hand, but my (verbose) pseudocode should be enough for you to throw something together in minutes.

Regards
Admiral

LLXX
June 10th, 2006, 00:51
Quote:
[Originally Posted by Admiral]From here, the idea is to linear search (binary-compare) for the opcode for CALL DWORD PTR (which is 0xFF15) and take note of the DWORD that follows (it will be a relative pointer). I'd recommend searching for a few of these until you get a decent sample size and a very low variance (suggesting that your samples are mostly pointing at the IAT).
Actually, if you just try repnz scasw with ax=15FF, you'll miss approximately half of the calls that aren't word-aligned the same as edi. Better to use repnz scasb to find the FFs (possible CALLs), then dec edi and scasw when an FF is found.

The location of the IAT can be further verified by checking whether most of the dwords in the candidate IAT point to valid imports in kernel32. Checking for the ExitProcess address is a good choice: I haven't seen many programs (in fact I don't remember any) that don't use ExitProcess.
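A small Python sketch of that sanity check, assuming the candidate table has already been read as a list of dwords and the load ranges of the target's DLLs are known. The function names, the 0.8 threshold, and the `known_export` parameter (standing in for the resolved ExitProcess address) are hypothetical.

```python
def fraction_in_modules(entries, module_ranges):
    """Share of non-null entries that fall inside any (start, end) module range."""
    nonnull = [e for e in entries if e]
    if not nonnull:
        return 0.0
    hits = sum(any(lo <= e < hi for lo, hi in module_ranges) for e in nonnull)
    return hits / len(nonnull)

def looks_like_iat(entries, module_ranges, known_export=None, threshold=0.8):
    """Accept the candidate if enough entries hit a module and, when given,
    the known export address (e.g. ExitProcess) is present in the table."""
    if known_export is not None and known_export not in entries:
        return False
    return fraction_in_modules(entries, module_ranges) >= threshold
```

The threshold is a judgment call: a table with redirected entries will score lower, so a strict value mainly suits unprotected targets.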

svetlana57
June 10th, 2006, 01:40
Ok, I understand how to get an approximate address of the IAT (using the search pattern 0xFF15 I'll find an address somewhere in the middle of the IAT), but I don't know how to find its exact start and end. I'm talking about protected apps. These addresses may sometimes lead to allocated memory rather than straight into kernel32 or user32, so I can't check the destination to be sure whether it is still an IAT address or not. The end of the IAT may sometimes be filled with junk rather than zeroed. One more problem: IAT addresses in some range may be diluted with junk. Is it possible to tell them apart?

Admiral
June 10th, 2006, 08:34
Well now you have a tricky task ahead of you.

ImpRec does a sterling job of bounding an intact IAT but if you feed it something a bit less linear (such as Armadillo's Import Elimination) it will fail.

It's up to you just how much effort you want to put into this. A good start would be to enumerate the target's memory pages. A reasonably quick method from here would be to scan from your known IAT point (forwards and backwards, as before, up to a null pair failsafe or the page boundary if that's safe) to find the first and last dwords that lie within space you know to be allocated to some DLL. From here, you just need to fine-tune the ends outwards. Most of the packers I've seen keep their false IAT entries within a contiguous memory block, so one idea would be to, again, statistically analyse the IAT entries to find the modal range of invalid dword pointers. This will likely be the area containing the false entries and hence can be whitelisted as valid DLL space.
The way in which you proceed is highly target-dependent, but it's generally better to play it safe and allow a few false positives to maximise your genuines (as a well protected IAT will have a load of bad entries within its bounds anyway).
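One way to sketch this two-pass idea in Python, assuming the scanned window is already a list of dwords and the DLL ranges came from enumerating the target's memory. The names and the 64 KB bucket used for the "modal range" are assumptions chosen for illustration.

```python
from collections import Counter

def in_any(addr, ranges):
    """True if addr lies inside any half-open (start, end) range."""
    return any(lo <= addr < hi for lo, hi in ranges)

def core_span(entries, ranges):
    """Indices of the first and last entries that point into a listed range."""
    good = [i for i, e in enumerate(entries) if in_any(e, ranges)]
    return (good[0], good[-1]) if good else None

def modal_foreign_range(entries, module_ranges, bucket=0x10000):
    """Most common bucket among non-null entries that point outside every module."""
    foreign = [e for e in entries if e and not in_any(e, module_ranges)]
    if not foreign:
        return None
    best, _ = Counter(e // bucket for e in foreign).most_common(1)[0]
    return best * bucket, (best + 1) * bucket

def widened_span(entries, module_ranges):
    """Core span recomputed with the modal foreign range whitelisted as DLL space."""
    extra = modal_foreign_range(entries, module_ranges)
    ranges = list(module_ranges) + ([extra] if extra else [])
    return core_span(entries, ranges)
```

Whitelisting the modal range deliberately lets some false positives through, in line with the advice to play it safe.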

So I guess the answer is no, there is no surefire method if your IAT entries aren't all pointing to real DLL functions. This assumes you are counting false entries as valid entries. If not:

If efficiency isn't a huge concern, a good way to ensure you catch all real DLL thunks would be to create a list of the locations of every possible DLL function address:
Iterate through the address space, step aligned to 0x1000 (which is where all PE headers must be loaded by Windows' orders) and check the beginning of each page (if it exists) against the PE header signature. Each time you find one, parse what's needed of the header to find the location of the relevant disk image. Once you have the filenames of every DLL loaded (this will include both runtime and compile-time loaded DLLs) you should LoadLibraryA each one in turn and parse the exports (on both names and ordinals) to create an array with the address of each and every function in the DLL. Now perform the necessary offset calculation so that these addresses are valid within the remote process's address space and you will have a complete list of every valid IAT dword (excluding false entries). Your job is a great deal easier from here.
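A hedged Python sketch of two pieces of that procedure: the page-stepped header scan and the final offset calculation. `read_page` is a hypothetical callback (in practice a wrapper around ReadProcessMemory) that returns the bytes at an address or None if the page is unmapped. Parsing the export table itself is left out.

```python
import struct

def is_pe_header(page: bytes) -> bool:
    """True if the page starts with 'MZ' and e_lfanew points at the PE signature."""
    if len(page) < 0x40 or page[:2] != b"MZ":
        return False
    e_lfanew = struct.unpack_from("<I", page, 0x3C)[0]
    return page[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"

def find_module_bases(read_page, start, end, step=0x1000):
    """Step through [start, end) and collect every address holding a PE header."""
    bases = []
    for addr in range(start, end, step):
        page = read_page(addr)
        if page and is_pe_header(page):
            bases.append(addr)
    return bases

def rebase(local_addr, local_base, remote_base):
    """Translate an export address from the local copy of a DLL to the remote one."""
    return remote_base + (local_addr - local_base)
```

If the DLL happens to load at the same base in both processes, `rebase` returns the address unchanged.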

Of course, this is a desperate measure - it's not dead simple to implement and it can take a very long time to run. But if your intentions are to recreate an import table a la ImpRec then many of these steps are necessary anyway.

One point that I neglected to mention in my last post: If you are using a mean to find a safe IAT point, make sure it's aligned correctly with the dwords. Although not a necessary restriction, I've not seen a protector that has IAT entries that aren't 4-aligned in memory, so removing the modulo 4 from your mean should be safe.

Regards
Admiral

upb
June 12th, 2006, 03:51
Quote:
[Originally Posted by Admiral]From here, the idea is to linear search (binary-compare) for the opcode for CALL DWORD PTR (which is 0xFF15) and take note of the DWORD that follows (it will be a relative pointer).


Hmm? nope, it will be a VA

Nacho_dj
June 12th, 2006, 06:37
Hello:

Don't forget to search for other references to the IAT, like this:
MOV EDX, [632400]
CALL EDX

In this case the binary code to search is not 0xFF15, but 0x8B15.

Have a look at the codes I search in an embedded Import Rebuilder of an unpacker I have coded:
Code:
0xFF15 CALL DWORD []
0xFF25 JMP DWORD []
0xFF35 PUSH DWORD []
0x8B0D MOV ECX,DWORD []
0x8B15 MOV EDX,DWORD []
0x8B1D MOV EBX,DWORD []
0x8B25 MOV ESP,DWORD []
0x8B2D MOV EBP,DWORD []
0x8B35 MOV ESI,DWORD []
0x8B3D MOV EDI,DWORD []
0xA1 MOV EAX,DWORD []
0x3B05 CMP EAX,DWORD []
0x3B1D CMP EBX,DWORD []
0x3B0D CMP ECX,DWORD []
0x3B15 CMP EDX,DWORD []
0x3B25 CMP ESP,DWORD []
0x3B35 CMP ESI,DWORD []
0x3B3D CMP EDI,DWORD []


All are valid references to the IAT...
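The opcode list above could drive a scan along these lines. This is a Python sketch rather than the code from the unpacker mentioned; the function name and the idea of filtering operands against an already estimated IAT range are assumptions. The last table entry is written as CMP EDI, which is what the 3B 3D encoding decodes to.

```python
import struct

# Opcode bytes -> mnemonic; each is followed by a 32-bit absolute address.
IAT_REF_OPCODES = {
    b"\xff\x15": "CALL DWORD []", b"\xff\x25": "JMP DWORD []",
    b"\xff\x35": "PUSH DWORD []",
    b"\x8b\x0d": "MOV ECX,DWORD []", b"\x8b\x15": "MOV EDX,DWORD []",
    b"\x8b\x1d": "MOV EBX,DWORD []", b"\x8b\x25": "MOV ESP,DWORD []",
    b"\x8b\x2d": "MOV EBP,DWORD []", b"\x8b\x35": "MOV ESI,DWORD []",
    b"\x8b\x3d": "MOV EDI,DWORD []", b"\xa1": "MOV EAX,DWORD []",
    b"\x3b\x05": "CMP EAX,DWORD []", b"\x3b\x1d": "CMP EBX,DWORD []",
    b"\x3b\x0d": "CMP ECX,DWORD []", b"\x3b\x15": "CMP EDX,DWORD []",
    b"\x3b\x25": "CMP ESP,DWORD []", b"\x3b\x35": "CMP ESI,DWORD []",
    b"\x3b\x3d": "CMP EDI,DWORD []",
}

def find_iat_references(code: bytes, iat_start: int, iat_end: int):
    """Return (offset, mnemonic, address) for every listed opcode whose
    operand lies inside the half-open range [iat_start, iat_end)."""
    hits = []
    for off in range(len(code) - 4):
        for opcode, name in IAT_REF_OPCODES.items():
            end = off + len(opcode)
            if code.startswith(opcode, off) and end + 4 <= len(code):
                addr = struct.unpack_from("<I", code, end)[0]
                if iat_start <= addr < iat_end:
                    hits.append((off, name, addr))
    return hits
```

Because this is a raw byte search and not a disassembly, the range filter is what keeps chance matches inside other instructions from polluting the result.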

I hope this helps...

Nacho_dj

LLXX
June 12th, 2006, 23:05
@nacho_dj: what are the CMPs doing in the list of possible import table calls? I've never seen a single code example that called through that way.

You also seem to have forgotten the direct call xxxxxxxx (E8), which is used by a few packers as well (y0da packer being the first example I can think of, having unpacked one a few hours ago).
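For the E8 form, the operand is a signed 32-bit displacement from the address of the next instruction, not a pointer. A short Python sketch of resolving it follows; the function names are hypothetical. The second function covers only one possible arrangement, where the call lands on an FF 25 jump stub inside the same buffer, and is not meant to describe what any particular packer does.

```python
import struct

def call_target(code: bytes, off: int, code_base: int) -> int:
    """Absolute target of the E8 call at code[off]:
    address of the next instruction plus the signed rel32."""
    if off + 5 > len(code) or code[off] != 0xE8:
        raise ValueError("no complete E8 call at this offset")
    rel = struct.unpack_from("<i", code, off + 1)[0]
    return (code_base + off + 5 + rel) & 0xFFFFFFFF

def iat_slot_via_thunk(code: bytes, off: int, code_base: int):
    """If the call lands on an FF 25 jump inside the buffer, return that
    jump's operand (the address it reads its destination from), else None."""
    t = call_target(code, off, code_base) - code_base
    if 0 <= t and t + 6 <= len(code) and code[t:t + 2] == b"\xff\x25":
        return struct.unpack_from("<I", code, t + 2)[0]
    return None
```

When the target is anything other than such a stub, the code at the target has to be examined by other means to see which API it reaches.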

Nacho_dj
June 13th, 2006, 02:00
LLXX, I'd like to know why those CMPs are used that way too, but I did find that usage in a certain target. Sorry for not remembering which target it was.

I know it was not because of the protection: I found it after removing the protection wrapper, that is, in clean code. Maybe it is a compiler whim...

And yes, the direct call (E8) that you mentioned seems an interesting issue.
However, I see a small problem: since a direct call does not go through a pointer to an API, the place it targets would have to contain code that gives access to the API, rather than a pointer to the API as in the IAT. So how could the target use that direct call (E8) to reach the API it needs?

Cheers

Nacho_dj