Unknown packer and how do generic OEP finders work??? [Archive]

yaa

May 9th, 2004, 13:01

Hello,

I happened upon a Delphi application packed by an unknown packer. All the file analyzers I tried simply detected it as a Delphi application. However, PEiD's generic unpacker plugin succeeded in correctly identifying the OEP and unpacking it (on W9x systems, Christoph Gabler's old Startup Code Unpacker was also successful in unpacking the application). Although I can now crack it with no problems, I'd like to understand more about what I've faced.

The unpacked application has an additional PE section (named .snaker). Also, the packer loader checks for the presence of a debugger (IsDebuggerPresent is called) and has code that checks for the integrity of wide areas of the loader's own code. The function that does the integrity checking is the following:

Code:

00809E20  /$  PUSH EBX

00809E21  |.  PUSH ESI

00809E22  |.  PUSH EDI

00809E23  |.  TEST EDX,EDX

00809E25  |.  JBE SHORT Xselerat.00809E6E

00809E27  |.  MOV EBX,ECX

00809E29  |.  AND EBX,0FFFF

00809E2F  |.  MOV ESI,ECX

00809E31  |.  SHR ESI,10

00809E34  |.  AND ESI,0FFFF

00809E3A  |.  MOV EDI,EDX                      ; EDX contains the number of bytes

00809E3C  |.  DEC EDI

00809E3D  |.  TEST EDI,EDI

00809E3F  |.  JL SHORT Xselerat.00809E65

00809E41  |.  INC EDI

00809E42  |.  MOV ECX,EAX                      ; EAX contains the start address

00809E44  |>  /XOR EAX,EAX

00809E46  |.  |MOV AL,BYTE PTR DS:[ECX]

00809E48  |.  |ADD EAX,EBX

00809E4A  |.  |MOV EBX,0FFF1

00809E4F  |.  |CDQ

00809E50  |.  |IDIV EBX

00809E52  |.  |MOV EBX,EDX

00809E54  |.  |LEA EAX,DWORD PTR DS:[EBX+ESI]

00809E57  |.  |MOV ESI,0FFF1

00809E5C  |.  |CDQ

00809E5D  |.  |IDIV ESI

00809E5F  |.  |MOV ESI,EDX

00809E61  |.  |INC ECX

00809E62  |.  |DEC EDI                         ; exit the loop when EDI is zero

00809E63  |.^ \JNZ SHORT Xselerat.00809E44

00809E65  |>  MOV EAX,ESI

00809E67  |.  SHL EAX,10

00809E6A  |.  ADD EAX,EBX

00809E6C  |.  JMP SHORT Xselerat.00809E70

00809E6E  |>  MOV EAX,ECX

00809E70  |>  POP EDI

00809E71  |.  POP ESI

00809E72  |.  POP EBX

00809E73  \.  RETN
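For what it's worth, the loop above has the same structure as an Adler-32 checksum: two 16-bit running sums, each reduced modulo 0FFF1h (65521), combined as (high << 16) + low. That alone does not identify the packer, since Adler-32 ships with zlib and many other libraries. A minimal Python sketch of the routine follows; the name loader_checksum is made up for illustration, and the seed the loader actually passes in ECX is not shown in the listing, so the value 1 used below is only an assumption (it is the usual Adler-32 starting value).

```python
import zlib

MOD = 0xFFF1  # 65521, the constant loaded into EBX/ESI before each IDIV


def loader_checksum(data: bytes, seed: int) -> int:
    """Mirror of the routine at 00809E20 (EAX = start, EDX = length, ECX = seed).

    The function name is hypothetical; only the arithmetic comes from the listing.
    """
    if len(data) == 0:
        return seed & 0xFFFFFFFF      # JBE path: ECX is returned unchanged
    low = seed & 0xFFFF               # EBX
    high = (seed >> 16) & 0xFFFF      # ESI
    for byte in data:
        low = (low + byte) % MOD      # ADD EAX,EBX / IDIV / MOV EBX,EDX
        high = (high + low) % MOD     # LEA EAX,[EBX+ESI] / IDIV / MOV ESI,EDX
    return ((high << 16) + low) & 0xFFFFFFFF  # SHL EAX,10 / ADD EAX,EBX


if __name__ == "__main__":
    sample = b"Wikipedia"
    # With a seed of 1 (assumed), the result agrees with zlib's Adler-32.
    print(hex(loader_checksum(sample, 1)), hex(zlib.adler32(sample, 1)))
```

If the loader compares the result against a stored value, patching the checked region means either recomputing that value or neutralizing the comparison.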

Has anyone got any idea about what packer this may be???

But at this point I'm very curious about generic OEP finders (I suppose that generic unpackers simply dump the application at the OEP and try to fix the IAT). How do they work? What common property among packers do they take advantage of to identify the OEP? Any insight into this most fascinating question is most welcome.

Thx.

yaa

nikolatesla20

May 10th, 2004, 09:35

One of the most common things a generic unpacker can look for is code execution that finally reaches the code section.

For example, "most" packers put their unpacking code in higher memory. They have to, since the real code of the original program can't be moved in memory (not unless they patched every single address reference). The compiler put the imagebase address at a specific point in memory, and without a relocation section (which most EXEs lack) the program code HAS to stay at this imagebase. (DLLs can move around.)

Anyway, all a generic unpacker really has to do is look for code execution that falls within the first or second EXE section. Most packer code is definitely going to be higher in mem than this, and the packer code is what starts first, so that means that eventually a basic packer is going to have to unpack the program, and jump to the original code. It's when this code starts that you know you are probably at the OEP.

This is kinda how the "trace" in some unpackers works. Like Revirgin's trace.
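A rough Python sketch of that idea, purely for illustration: it assumes you already have a list of executed addresses (from a single-step trace, say) and the address range of the first section. The function name and all addresses below are made up; real tracers such as Revirgin's are certainly more involved.

```python
from typing import Iterable, Optional, Tuple


def find_oep_candidate(trace: Iterable[int],
                       code_range: Tuple[int, int]) -> Optional[int]:
    """Return the first traced address inside code_range that is reached
    after execution was seen outside it, or None if that never happens."""
    start, end = code_range
    seen_outside = False
    for eip in trace:
        inside = start <= eip < end
        if inside and seen_outside:
            return eip            # packer code just handed control to the app
        if not inside:
            seen_outside = True   # still running in the packer's own section
    return None


if __name__ == "__main__":
    # Hypothetical trace: three steps in a high packer section, then a jump
    # into the original code section.
    trace = [0x809000, 0x809005, 0x80A000, 0x4A1B2C, 0x4A1B2D]
    print(hex(find_oep_candidate(trace, (0x401000, 0x4B0000))))
```

This only yields a candidate: a packer that bounces through the code section before the real entry point will fool it.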

-nt20

yaa

May 10th, 2004, 11:32

nikolatesla20,

your explanation about the packer's loader code usually having to work at high addresses is illuminating indeed. I never thought of it. How foolish of me.

But I wonder, couldn't packers steal a few of the app's startup instructions (moving just a few instructions and, if needed, recalculating a few relative addresses isn't a big deal) and move them to higher addresses??? Or vice versa, couldn't they find a few holes in the original application code and use those holes for their own code??? This would probably fool any generic unpacker unless the unpacker has been coded to look for well-known function prologues ... but then these vary depending on the language and compiler used .. and probably even compiler settings (speed/size optimizations).

yaa

nikolatesla20

May 10th, 2004, 12:05

Yes, some do steal some of the starting code; this is known as "stolen bytes" in the ASProtect packer world.

However, you will find through experience in unpacking that a large amount of the time stolen bytes are actually unnecessary for the program to run. You can usually get away without having them.

Other programs, more advanced protectors, do replace instructions in the original program. Actually most of the time, they simply make a new import table with addresses that point to the protector's code. There are very few protections which actually go to the trouble of modding the code itself in the program (SafeDisc is one that does).

Since the code eventually has to reach the first section, it's pretty tough to "disguise" its execution. So most packers try instead to prevent debuggers, etc., from watching what they are doing.

Another solution, used by a protection known as SoftProtect and very original, is to scan for RET instructions in the target while packing it. It then puts the addresses of these instructions into a table, and when it comes time to "jump" to the OEP, it simply does a type of "stack unwind" - it pushes the beginning of the table onto the stack, and does a RET. This causes the processor to jump to the first address listed at the beginning of the table. And where does this address lead? Why, to another RET of course, which causes the processor to pop the next address off the stack, where it once again finds another RET, and so on, until it reaches the end of the table and lands on the OEP address. This is very effective at preventing tracers, since the number of RETs could be very large, and numerous ones get executed in the code section before actually reaching the OEP. It is a very unique technique and I praise the author for coming up with it. However, its weakness is the table. If you can find the table, you have the OEP, since it's at the end of the table! How do you know you are at the end of the table? Simple, you just compare each address in the table with the range of the code section. If it's still in range, it's a valid address; if not, you've reached the end of the table, and the last address you looked at was the OEP. So all you really need to do is find the beginning of this address-OEP-stack-unwind table.
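The table walk described here could be sketched in Python roughly as follows. This assumes the table has already been located and dumped as raw bytes, and that it is a flat array of 32-bit little-endian addresses; the function name, the layout, and the sample addresses are all assumptions for illustration, not SoftProtect's documented format.

```python
import struct
from typing import Optional


def oep_from_ret_table(table: bytes, code_start: int,
                       code_end: int) -> Optional[int]:
    """Walk a dumped table of 32-bit addresses and return the last one that
    still falls inside the code section (the presumed OEP)."""
    usable = len(table) // 4 * 4          # ignore any trailing partial dword
    last_valid = None
    for (addr,) in struct.iter_unpack("<I", table[:usable]):
        if not (code_start <= addr < code_end):
            break                         # out of range: the table ended
        last_valid = addr
    return last_valid


if __name__ == "__main__":
    # Hypothetical table: two RET locations, then the OEP, then a terminator.
    dumped = struct.pack("<4I", 0x401234, 0x40ABCD, 0x4A1B2C, 0)
    print(hex(oep_from_ret_table(dumped, 0x401000, 0x4B0000)))
```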

So basically each protector has its own unique ways of trying to hide what it does, but the same as always, if it executes, you can watch/modify it. Just have to find what to look for.

-nt20

yaa

May 10th, 2004, 13:13

But I wonder, how are imported function addresses updated???
I mean does the OS loader update the IAT even for the "wrapped" application imported functions or is the packer loader responsible for this task???

This is a point I've never been able to clarify.

yaa

nikolatesla20

May 10th, 2004, 16:05

The packer loader is responsible for this job.

-nt20

yaa

May 10th, 2004, 17:30

But is this because even the original IAT gets packed? Because I suppose that otherwise the OS loader would update it .... or am I mistaken?

yaa

doug

May 10th, 2004, 20:58

Quote:

[Originally Posted by yaa]But is this because even the original IAT gets packed? Because I suppose that otherwise the OS loader would update it .... or am I mistaken?

yaa

More often than not, the imports are partially (or completely) wiped and rebuilt through various techniques (and this is generally what characterizes the packers) at run-time, usually in the last few steps of the packer, before it passes control back to the app.

%UNDEFINED%

May 10th, 2004, 21:00

Basically the original IAT is stripped, although this is not true of all packers.
Most encryptors designed to hinder anyone from unpacking them use API redirection and IAT emulation.
While I do not know how to explain exactly how it works, I know what it does.
From my experience, I have seen that the cryptors will remove all of the imports except one from each different DLL.
That is what is left of the original.
Based on that data the cryptor will find all of the needed thunks, place them elsewhere in memory, and build a temporary IAT.
All API calls will go through this.
But it won't be dumped with the normal code/data/rsrc, and even if it was, it's not a usable IAT.
This prevents the program from running if it's dumped, since no usable import table exists.

yaa

May 11th, 2004, 03:46

UNDEFINED,

if I understood correctly what you mean by API redirection and IAT emulation, to work either:

(1) all calls to APIs made in the application have their relative addresses corrected to point to the new programmatically built thunk of correct API addresses (your emulated IAT). In this case the original IAT isn't at all used.

or:

(2) the real IAT is used as a thunk, filling it with the addresses that lead to the real thunk of correct API addresses (your emulated IAT). This last solution would save the packer from having to correct all API calls made thru the application since they already point to the original IAT.

yaa

%UNDEFINED%

May 11th, 2004, 21:32

Basically the second one. If you're interested in some detailed reading, take a look at this essay by AndreaGeddon.

It goes into great detail about how the import table is "rebuilt".

nikolatesla20

May 11th, 2004, 22:01

Quote:

[Originally Posted by yaa]UNDEFINED,

if I understood correctly what you mean by API redirection and IAT emulation, to work either:

(1) all calls to APIs made in the application have their relative addresses corrected to point to the new programmatically built thunk of correct API addresses (your emulated IAT). In this case the original IAT isn't at all used.

or:

(2) the real IAT is used as a thunk, filling it with the addresses that lead to the real thunk of correct API addresses (your emulated IAT). This last solution would save the packer from having to correct all API calls made thru the application since they already point to the original IAT.

yaa

The second one is usually the case. The reason is that a program usually has two "thunk tables" per imported DLL: the "OriginalFirstThunk" and the "FirstThunk". The FirstThunk table is the actual table that the code, such as Call Dword Ptr[xxxxxxxx] instructions which call into the API, points to (the xxxxxxxx is an address in the table). The FirstThunk table MUST be present then, because all code depends on it. The OriginalFirstThunk table is an untouched copy that keeps the function names/ordinals; the loader reads it to resolve the imports, and if it is missing, the loader falls back to the names still stored in the FirstThunk table.

The O.S. loader looks at the Import Table and fills in the addresses of each DLL's imported functions into the FirstThunk table. If the Import Table does not have any entries, the O.S. loader will FAIL to load the program; it will say something like "Program initialization failed". This is why packers have to leave at least one API call for each DLL that is imported in the FirstThunk table - it's so the O.S. doesn't crap out. Some protections, like Armadillo, replace the entire Import Table (by Import Table I mean the table which lists DLL names and DLL function names, not the actual table of addresses to functions - the table of addresses to functions is the "FirstThunk" table). BUT they still must always patch the DLL function addresses into the FirstThunk table. The FirstThunks cannot be moved (not unless you have a relocations section which tells the O.S. loader how to move them). Hence this is the greatest weakness of IAT redirection. One can easily find where the FirstThunks are and work on the addresses from there; usually some will be valid and point to real API calls, and others will point to redirected protection calls.
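As a rough illustration of "working on the addresses from there", here is a small Python sketch that sorts FirstThunk entries into ones landing inside a loaded DLL and ones that don't (likely redirected). It assumes you have already read the thunk values from the running process and know each DLL's base and size; the function name, module ranges, and addresses are invented for the example.

```python
from typing import Dict, List, Optional, Tuple


def classify_thunks(thunks: List[int],
                    modules: Dict[str, Tuple[int, int]]
                    ) -> List[Tuple[int, Optional[str]]]:
    """Pair each FirstThunk entry with the DLL whose mapped range contains it.

    An entry paired with None points outside every known DLL, which is what
    a redirected (protector-owned) thunk would look like.
    """
    result = []
    for addr in thunks:
        owner = None
        for name, (base, size) in modules.items():
            if base <= addr < base + size:
                owner = name
                break
        result.append((addr, owner))
    return result


if __name__ == "__main__":
    # Hypothetical module map and thunk values.
    modules = {"kernel32.dll": (0x77E60000, 0xE6000),
               "user32.dll": (0x77D40000, 0x8D000)}
    thunks = [0x77E7A5FD, 0x00A31000, 0x77D43F12]
    for addr, owner in classify_thunks(thunks, modules):
        print(hex(addr), owner or "redirected?")
```

The entries flagged as redirected are the ones that still need to be traced back to the real API before the import table can be rebuilt.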

It really pays to study up on Import table structures, it makes unpacking 100% easier to understand.

-nt20

yaa

May 12th, 2004, 07:58

As I see it, you could leave the FirstThunk there so that the OS loader will not complain and simply redirect all the CALL DWORD PTR [xxxxxxxx] instructions to point to a completely different address table.

I will read Andrea's tut, it seems just what I need to strengthen my understanding of how packers work.

yaa

nikolatesla20

May 12th, 2004, 11:30

Quote:

[Originally Posted by yaa]As I see it, you could leave the FirstThunk there so that the OS loader will not complain and simply redirect all the CALL DWORD PTR [xxxxxxxx] instructions to point to a completely different address table.

I will read Andrea's tut, it seems just what I need to strengthen my understanding of how packers work.

yaa

Yeah, you could change the CALLs, but it's a lot more work, and you have to make sure you don't miss one of them or write over a CALL that isn't an API call. To do that you have to cross-reference every CALL with an API to see if it goes to the import table. It's possible but more complicated; that's why most protectors don't do it. Like I said, SafeDisc is one that DOES do this. The other reason is there might be a JMP table instead of calls. In either case, you most likely have to use a disassembly engine to do it right: disassemble the code, search for CALL DWORD PTR strings, and check the address that follows to see if it's in the import table region. Then you know you can patch it. So most protectionists are just too lazy and/or stupid to add a disassembly engine.
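To make the search concrete, here is a naive Python sketch that scans raw 32-bit code bytes for the FF 15 (CALL DWORD PTR [addr]) and FF 25 (JMP DWORD PTR [addr]) encodings and keeps the ones whose operand falls inside the import table region. It is deliberately NOT a disassembler: a plain byte scan can hit those two bytes in the middle of another instruction or in data, which is exactly why a real disassembly engine is the safer route. The function name and all addresses are made up.

```python
import struct
from typing import List, Tuple


def find_iat_references(code: bytes, code_va: int,
                        iat_start: int, iat_end: int
                        ) -> List[Tuple[int, str, int]]:
    """Return (address, 'call' or 'jmp', operand) for every FF 15 / FF 25
    byte pattern whose 32-bit operand lies inside [iat_start, iat_end)."""
    hits = []
    for i in range(len(code) - 5):        # each candidate is 6 bytes long
        if code[i] == 0xFF and code[i + 1] in (0x15, 0x25):
            (target,) = struct.unpack_from("<I", code, i + 2)
            if iat_start <= target < iat_end:
                kind = "call" if code[i + 1] == 0x15 else "jmp"
                hits.append((code_va + i, kind, target))
    return hits


if __name__ == "__main__":
    # Hypothetical code bytes: NOP, an API call, a jump-table entry, and a
    # call through a pointer that is not in the import table.
    code = (b"\x90"
            + b"\xFF\x15" + struct.pack("<I", 0x4B2010)
            + b"\xFF\x25" + struct.pack("<I", 0x4B2014)
            + b"\xFF\x15" + struct.pack("<I", 0x500000))
    for va, kind, target in find_iat_references(code, 0x401000,
                                                0x4B2000, 0x4B3000):
        print(hex(va), kind, hex(target))
```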

In the near future, my unpackers will contain a disassembly engine so they can more effectively find code sequences (including OEP code). For example, some code sequences that do the same thing can actually be encoded in different ways. All you really need to do is look for a "template". Not even exact instructions, just things like "look for this, does this item come after it, and then this next item is in there too? And in the middle is this item between them all? If so, this is probably the right spot..". So it would be opcode independent; instead it searches for actual code operations. Combine a disassembler with a regular expression engine and you have a powerful tool on your hands IMO. Someone needs to make this; I think I will sometime in my spare time.
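One way the "template" idea might look, sketched in Python: take the disassembler's output, reduce it to a stream of mnemonics, and run an ordinary regular expression over that stream. The input format (address, mnemonic, operands), the function name, and the sample pattern are all assumptions for illustration; a real tool would want to match on operands as well.

```python
import re
from typing import List, Optional, Tuple

Instruction = Tuple[int, str, str]   # (address, mnemonic, operands)


def find_template(instructions: List[Instruction],
                  pattern: str) -> Optional[int]:
    """Search the mnemonic stream ("push;mov;call;") for a regex and return
    the address of the first instruction of the match, or None."""
    stream = "".join(mnem.lower() + ";" for _, mnem, _ in instructions)
    match = re.search(pattern, stream)
    if match is None:
        return None
    # Number of complete mnemonics before the match = index of its first one.
    index = stream.count(";", 0, match.start())
    return instructions[index][0]


if __name__ == "__main__":
    # Hypothetical listing: a frame setup, some stack adjustment, then a call.
    listing = [
        (0x4A1B20, "nop", ""),
        (0x4A1B2C, "push", "ebp"),
        (0x4A1B2D, "mov", "ebp, esp"),
        (0x4A1B2F, "add", "esp, -10h"),
        (0x4A1B32, "call", "0x406000"),
    ]
    # "push, then mov, then a call within the next few instructions"
    template = r"\bpush;mov;(?:\w+;){0,4}call;"
    print(hex(find_template(listing, template)))
```

Because the pattern only says "up to four instructions of anything" between the frame setup and the call, it tolerates compilers that encode or order the filler differently.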

-nt20

yaa

May 12th, 2004, 12:21

Quote:

[Originally Posted by nikolatesla20]The other reason is there might be a JMP table instead of calls.

Yes, I've seen that Borland compilers generate such jump tables.
It isn't even clear to me why they are used since they could just generate a direct jump to the dll function start instruction. A useless jump or anyhow one that could have been saved.

yaa

Dr.Golova

June 3rd, 2004, 15:06

Quote:

[Originally Posted by yaa]Yes, I've seen that Borland compilers generate such jump tables.
It isn't even clear to me why they are used since they could just generate a direct jump to the dll function start instruction. A useless jump or anyhow one that could have been saved.
yaa

Borland uses the OMF object file format, and these object files don't contain information about where external symbols come from. For example, the linker starts generating an EXE from several object files. In the first one it finds a reference (i.e. a call) to some external symbol "my_cool_proc", but the linker doesn't know whether this symbol is a function from a DLL or a public function from another object file. So the linker remembers the address of this reference; if it finds the symbol in a later object file, it links the call to that function. Once all object files are linked, it scans the import libraries, adds a jmp d,[xxxx] table entry for each needed symbol, and links the references to this table. Of course, the compiler (the object generator) is also part of this mess: if the compiler knows the symbol is in a DLL, it can generate a direct call d,[xxxx], as modern compilers do.