
View Full Version : Malware and initial stack pointer value


ZaiRoN
November 25th, 2008, 10:17
An old blog entry I forgot to import from my blog at Wordpress...


Here are the first lines of a piece of malware I was looking at a few days ago (MD5: DA4B7EF93C588AD799F1A1C5AFB6CFAD). The malware is packed, I think with a home-made packer; 40107C is the entry point, the first line of the loader’s code. The code is filled with useless instructions; nothing hard, but if you want to study the entire loader you have to pay attention to every single line of code. This time I’m not interested in the loader itself; I’ll focus on a strange behaviour, something I had never noticed before. The malware crashes at 4010AC on an XP sp3 machine, but it works fine on XP with Service Pack 1 or 2.
Code:
40107C ADD ECX,DWORD PTR SS:[ESP] ; useless
40107F MOV ESI,-70 ; useless
401084 ADD EDI,EAX ; useless
401086 MOV ECX,2AFFC5C8 ; useless
40108B ROL ECX,1 ; useless
40108E ROR EDX,15 ; useless
401091 MOV EDI,ESP ; edi = 12FFC4
401093 MOV EDX,FE000001 ; edx = 0xFE000001
401098 ROL EDX,7 ; edx = 0xFF
40109B SUB EAX,EBX ; useless
40109D AND EDI,EDX ; edi = 0x12FFC4 & 0xFF = 0xC4
40109F MOV EDX,25FE0 ; edx = 0x25FE0
4010A4 ROL EDX,3 ; edx = 0x12FF00
4010A7 ADD EDX,EDI ; edx = 0x12FFC4
4010A9 SAL ECX,11 ; useless
4010AC MOV EAX,DWORD PTR DS:[EDX] ; eax = 0x77E5EB69

The comments are taken from an XP sp1 debugging session. At the end of the snippet eax holds the value stored at the initial stack pointer: the return address into BaseProcessStart, which points at the instruction that pushes ExitThread’s parameter. There’s nothing interesting in these few lines of code, but it always pays to keep your eyes open when there are hardcoded values around. I’m referring to the value 0x12FF00 (“hardcoded” is not strictly right, since it is built from 0x25FE0 with a rotate, but the effect is the same). It’s not obvious, but this piece of code will not work on every machine: the author seems to have been sure about the initial stack address. I don’t know when the malware was written, but this code crashes on an XP machine with Service Pack 3. Maybe the malware was written before the final release of that service pack, I don’t know. Here is the same code tested on a machine running XP sp3:
Code:
401091 MOV EDI,ESP ; edi = 13FFC4
401093 MOV EDX,FE000001 ; edx = 0xFE000001
401098 ROL EDX,7 ; edx = 0xFF
40109D AND EDI,EDX ; edi = 0x13FFC4 & 0xFF = 0xC4
40109F MOV EDX,25FE0 ; edx = 0x25FE0
4010A4 ROL EDX,3 ; edx = 0x12FF00
4010A7 ADD EDX,EDI ; edx = 0x12FFC4
4010AC MOV EAX,DWORD PTR DS:[EDX] ; CRASH!!!

The initial stack address is not the same; this time it’s 0x13FFC4. The malware reads from 0x12FFC4, but the value it is looking for is stored at 0x13FFC4.
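
To make the arithmetic of the two traces easier to follow, here is a minimal sketch in Python that replays only the meaningful instructions of the loader (the junk ones are left out). The function names are my own and purely illustrative; the constants are the ones in the listings above.

```python
MASK32 = 0xFFFFFFFF

def rol32(value, count):
    """Rotate a 32-bit value left by count bits."""
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32

def loader_read_address(esp):
    """Address dereferenced at 4010AC for a given ESP at the entry point."""
    low_mask = rol32(0xFE000001, 7)   # 401093/401098: edx = 0xFF
    low_byte = esp & low_mask         # 40109D: keep only the low byte of ESP
    fixed = rol32(0x25FE0, 3)         # 40109F/4010A4: edx = 0x12FF00
    return (fixed + low_byte) & MASK32

for esp in (0x12FFC4, 0x13FFC4):      # sp1/sp2 value, then sp3 value
    addr = loader_read_address(esp)
    print(f"esp={esp:#x} -> reads {addr:#x} (same as esp: {addr == esp})")
```

Only the low byte of esp survives the AND, so the upper part of the address always comes from the constant, whatever the real stack location is.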

Who decides which value is assigned to esp, 12FFC4 or 13FFC4?
My investigation started from the kernel32.CreateProcessInternalW function. All the code refers to an XP sp3 machine, but the sp1 code is almost identical.
Code:
7C819DE1 mov eax, [ebp+MaximumStackSize]
7C819DE7 lea ecx, [ebp+InitialTEB]
7C819DED push ecx ; InitialTEB
7C819DEE push eax ; MaximumStackSize
7C819DEF push [ebp+StackSize] ; StackSize
7C819DF5 push [ebp+hProcess] ; hProcess
7C819DFB call _BaseCreateStack@16 ; BaseCreateStack(x,x,x,x)
7C819E00 mov [ebp+var_9EC], eax ; eax = 0 means SUCCESS
7C819E06 cmp eax, ebx ; ebx = 0
7C819E08 jl _BaseSetLastNTError ; Jump to error check
7C819E0E push ebx ; NULL
7C819E0F push [ebp+InitialSP] ; Stack pointer
7C819E15 push [ebp+InitialPC] ; Program counter
7C819E1B push [ebp+Parameter] ; Parameter
7C819E21 lea eax, [ebp+Context]
7C819E27 push eax ; Context
7C819E28 call _BaseInitializeContext@20 ; BaseInitializeContext(x,x,x,x,x)

This is where the new process’s context is initialized. It is only an initialization, so you won’t see the final value of each register (the values at the EP of the new process), but it’s enough to understand why the esp values differ.
There are two functions in the snippet above. BaseCreateStack creates the stack for the process that is about to run. BaseInitializeContext, as the name suggests, initializes the context structure using some values obtained by the previous function. Let’s start with the first one: BaseCreateStack.
First, it checks two values: MaximumStackSize and StackSize. Both are loaded from the image of the process to run using NtQuerySection. Among the information in a PE header there are two fields named SizeOfStackReserve and SizeOfStackCommit, which the system takes and saves as MaximumStackSize and StackSize. MSDN describes the fields as follows:
SizeOfStackReserve: the number of bytes to reserve for the stack. Only the memory specified by the SizeOfStackCommit member is committed at load time; the rest is made available one page at a time until this reserve size is reached.
SizeOfStackCommit: the number of bytes to commit for the stack.
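
If you want to look at these two fields without a PE editor, something like the following Python sketch should do. It is only an illustration: the function names are mine, and the demo builds a synthetic, non-loadable header in memory instead of touching a real file. The offsets are those of the optional header in the PE format (72 for SizeOfStackReserve in both PE32 and PE32+).

```python
import struct

def read_stack_sizes(image):
    """Return (SizeOfStackReserve, SizeOfStackCommit) from raw PE bytes."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = e_lfanew + 4 + 20            # skip signature and IMAGE_FILE_HEADER
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic == 0x10B:                 # PE32: two 32-bit fields
        return struct.unpack_from("<II", image, opt + 72)
    if magic == 0x20B:                 # PE32+: two 64-bit fields
        return struct.unpack_from("<QQ", image, opt + 72)
    raise ValueError("unknown optional header magic")

def make_demo_header(reserve, commit):
    """Build a synthetic PE32 header (hypothetical, for this demo only)."""
    buf = bytearray(0x200)
    buf[:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)        # e_lfanew
    buf[0x80:0x84] = b"PE\0\0"
    opt = 0x80 + 4 + 20
    struct.pack_into("<H", buf, opt, 0x10B)        # PE32 magic
    struct.pack_into("<II", buf, opt + 72, reserve, commit)
    return bytes(buf)

reserve, commit = read_stack_sizes(make_demo_header(0x100000, 0x1000))
print(hex(reserve), hex(commit))
```

On a real sample you would pass the bytes of the file instead of the demo buffer.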
Ok, now the system is going to check if they are valid or not:
Code:
7C8102B5 mov eax, large fs:18h ; eax = TEB
7C8102BB mov ecx, [eax+30h] ; ecx = PEB
...
7C8102D2 push dword ptr [ecx+8] ; PEB->ImageBaseAddress
...
7C8102DB call ds:__imp__RtlImageNtHeader@4 ; RtlImageNtHeader(x)
7C8102E1 test eax, eax
7C8102E3 jz failure
7C8102E9 mov ecx, [ebp+MaximumStackSize]
7C8102EC test ecx, ecx ; is MaximumStackSize zero?
7C8102EE mov edx, [eax+IMAGE_NT_HEADERS.OptionalHeader.SizeOfStackCommit]
7C8102F1 jnz short MaximumStackSize_not_zero
7C8102F3 mov ecx, [eax+IMAGE_NT_HEADERS.OptionalHeader.SizeOfStackReserve]
7C8102F6 mov [ebp+MaximumStackSize], ecx

If MaximumStackSize is non-zero the flow goes on; otherwise the variable has to be given a value. Which value? The SizeOfStackReserve field taken from the PE header of the image pointed to by PEB->ImageBaseAddress.
Ok, now it’s time to check the other variable; the check is very similar to the previous one:
Code:
7C8102F9 MaximumStackSize_not_zero:
7C8102F9 mov eax, [ebp+StackSize]
7C8102FC test eax, eax ; Is StackSize zero?
7C8102FE push edi
7C8102FF mov edi, 0FFF00000h
7C810304 jnz StackSize_not_zero
7C81030A mov eax, edx
...

If StackSize is zero the content of the variable is filled with the value taken some lines above at 7C8102EE: SizeOfStackCommit. It’s almost the same check I described for MaximumStackSize.
If the values are not zero, the system checks them again, just to be sure they are valid:
Code:
7C80AFC2 cmp eax, ecx ; compare between StackSize and MaximumStackSize
7C80AFC4 jb loc_7C81030C
7C80AFCA lea ecx, [eax+0FFFFFh] ;
7C80AFD0 and ecx, edi ; fix MaximumStackSize
7C80AFD2 mov [ebp+MaximumStackSize], ecx ;
7C80AFD5 jmp loc_7C81030C
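
The three listings above (default reserve, default commit, round-up) can be summarized with a small Python sketch. It follows only the branches shown in the disassembly and ignores the alignment code that comes later; the function and parameter names are mine, not symbols from kernel32.

```python
def normalize_stack_sizes(maximum_stack_size, stack_size, pe_reserve, pe_commit):
    """Mimic the initial checks of BaseCreateStack shown in the listings."""
    if maximum_stack_size == 0:
        # 7C8102F3: fall back to SizeOfStackReserve from the PE header
        maximum_stack_size = pe_reserve
    if stack_size == 0:
        # 7C81030A: fall back to SizeOfStackCommit from the PE header
        stack_size = pe_commit
    elif stack_size >= maximum_stack_size:
        # 7C80AFCA: round StackSize up to the next 1 MB boundary
        maximum_stack_size = (stack_size + 0xFFFFF) & 0xFFF00000
    return maximum_stack_size, stack_size

# Both arguments zero, as when the values come only from the PE header
print([hex(v) for v in normalize_stack_sizes(0, 0, 0x100000, 0x1000)])
# A caller asking for more commit than reserve
print([hex(v) for v in normalize_stack_sizes(0x100000, 0x180000, 0x100000, 0x1000)])
```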

StackSize must be smaller than MaximumStackSize; if it is not, the system raises MaximumStackSize to StackSize rounded up to the next 0x100000 (1 MB) boundary. Now that the initial check is complete, the function goes on with some alignment work, which is not very interesting per se. I’ll skip this part and move on to an interesting snippet:
Code:
7C81036F mov ebx, ds:__imp__NtAllocateVirtualMemory@24 ; NtAllocateVirtualMemory(x,x,x,x,x,x)

...
7C81037A push PAGE_READWRITE ; Protect: PAGE_READWRITE
...
7C810380 push MEM_RESERVE ; AllocationType: MEM_RESERVE
7C810385 lea eax, [ebp+MaximumStackSize]
7C810388 push eax ; RegionSize = MaximumStackSize
7C810389 push 0 ; ZeroBits = 0
7C81038B lea eax, [ebp+_BaseAddress]
7C81038E push eax ; BaseAddress = 0;
7C81038F push [ebp+hProcess] ; ProcessHandle
7C810392 mov [ebp+MaximumStackSize], ecx
7C810395 call ebx ; NtAllocateVirtualMemory

The system reserves the address space for the stack. It reserves MaximumStackSize bytes starting from an address chosen by the system: the first available address in the virtual address space. The chosen address is stored in BaseAddress and is used to set the InitialTeb->AllocateStackBase field:

Code:
7C81039F mov edi, [ebp+InitialTEB]
7C8103A2 mov ecx, [ebp+_BaseAddress]
7C8103A5 mov eax, [ebp+MaximumStackSize]
7C8103A8 and [edi+INITIAL_TEB.PreviousStackBase], 0
7C8103AB and [edi+INITIAL_TEB.PreviousStackLimit], 0
7C8103AF mov [edi+INITIAL_TEB.AllocateStackBase], ecx

The stack region now exists. There are three fields to set, and for now the system sets only the lowest address of the stack region (AllocateStackBase).
Code:
7C8103B2 add ecx, eax
7C8103B4 mov [edi+INITIAL_TEB.StackBase], ecx

InitialTeb->StackBase = BaseAddress + MaximumStackSize
The system delimits the stack area by setting its upper and lower bounds. The initial stack value is StackBase, and it decreases every time a push, call, etc. occurs.
The procedure goes on to commit the initial area of the stack, and after that BaseInitializeContext sets the right values for the registers (including esp). There’s no need to keep stepping through the code; I have enough information now to come to a conclusion.

PE fields:
SizeOfStackReserve: 0x100000
SizeOfStackCommit: 0x1000

Under XP sp3:
AllocateStackBase = 0x40000
MaximumStackSize = 0x100000
StackBase = 0x140000

Under XP sp1/sp2:
AllocateStackBase = 0x30000
MaximumStackSize = 0x100000
StackBase = 0x130000

It’s impossible for sp1/sp2 to have an esp value like 0x13FFC4 because the upper bound (StackBase) is 0x130000. StackBase is obtained as “AllocateStackBase + MaximumStackSize” (AllocateStackBase is the same as the BaseAddress value). MaximumStackSize was taken from the malware’s header, and AllocateStackBase was set by the NtAllocateVirtualMemory call.
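
The numbers above can be checked with a few lines of Python. This is just a sketch under one assumption of mine: the 0x3C gap between StackBase and the esp seen at the entry point is simply the difference observed in both traces of this post (0x130000 - 0x12FFC4 and 0x140000 - 0x13FFC4), not a documented constant.

```python
ENTRY_GAP = 0x3C  # assumption: StackBase - esp as observed in both traces

def stack_base(allocate_stack_base, maximum_stack_size):
    """InitialTeb->StackBase = BaseAddress + MaximumStackSize."""
    return allocate_stack_base + maximum_stack_size

def entry_esp(allocate_stack_base, maximum_stack_size):
    """ESP expected at the entry point, given the observed gap."""
    return stack_base(allocate_stack_base, maximum_stack_size) - ENTRY_GAP

for name, base in (("sp1/sp2", 0x30000), ("sp3", 0x40000)):
    print(name, hex(stack_base(base, 0x100000)), hex(entry_esp(base, 0x100000)))
```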
It seems the solution to the puzzle comes from NtAllocateVirtualMemory. The function is called with zero as the BaseAddress parameter; as I said before, this means the system assigns the first free virtual location, which is evidently 0x40000 under sp3 and 0x30000 under sp1/sp2. On my sp3 machine, browsing the memory, I noticed 0x1000 bytes allocated starting from 0x30000; there’s no trace of this memory area in the older XP service packs… What did they change in XP sp3? Well, I’m off for a vacation in Holland for now. I’ll try to reply when I’m back in two weeks. If the answer is obvious and/or you know why… feel free to comment with your idea.

Is it possible to solve the problem?
Well, it’s insane to fix a piece of malware just to make it run on an XP sp3 machine. Anyway, it’s not hard to make it runnable: you can simply change SizeOfStackReserve and/or SizeOfStackCommit directly in the PE header. I tried changing SizeOfStackReserve from 0x100000 to 0xF0000 and I got a runnable file. I don’t know how safe it is to change such parameters…
All the tests were done on my personal machines; I would like to know if your sp3 machine (or any other OS) has the same initial stack value.
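
For reference, the change described above amounts to overwriting one 32-bit field. The Python sketch below shows the idea on a synthetic, non-loadable PE32 header built in memory; the helper name is hypothetical and the code handles PE32 only. The last lines redo the StackBase arithmetic for sp3: 0x40000 + 0xF0000 gives 0x130000, the same upper bound seen on sp1/sp2.

```python
import struct

def patch_stack_reserve(image, new_reserve):
    """Overwrite SizeOfStackReserve in a PE32 image held in a bytearray.

    Returns the previous value. Hypothetical helper, PE32 only.
    """
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    opt = e_lfanew + 4 + 20            # skip signature and IMAGE_FILE_HEADER
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic != 0x10B:
        raise ValueError("this sketch handles PE32 only")
    (old,) = struct.unpack_from("<I", image, opt + 72)
    struct.pack_into("<I", image, opt + 72, new_reserve)
    return old

# Synthetic header for the demo (not a real executable)
demo = bytearray(0x200)
demo[:2] = b"MZ"
struct.pack_into("<I", demo, 0x3C, 0x80)
demo[0x80:0x84] = b"PE\0\0"
struct.pack_into("<H", demo, 0x80 + 24, 0x10B)
struct.pack_into("<I", demo, 0x80 + 24 + 72, 0x100000)

old = patch_stack_reserve(demo, 0xF0000)
new_stack_base = 0x40000 + 0xF0000     # sp3 AllocateStackBase + new reserve
print(hex(old), hex(new_stack_base))
```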

Kayaker
November 29th, 2008, 23:26
Hey ZaiRoN,

Interesting research as usual. I was rereading this and realized I had just seen something very similar, code relying on the initial stack pointer (or at least the contents of) to operate.

In fact, it was in the "lil malware unpacking contest" malware that evaluator had posted

http://www.woodmann.com/forum/showthread.php?t=12201

I mentioned in that thread that it was necessary for the PE to have been loaded through the normal kernel32!BaseProcessStart execution path (as opposed to a custom loader), because the correct stack return value was required in the code.


Here is the final step of PE loading in kernel32!BaseProcessStart. When the Entry Point of the PE is called by
7C816D4C call dword ptr [ebp+8]
the initial stack pointer will be 12FFC4 (for XPsp2), which is as Zairon noted here. The contents of that stack pointer will be the return address 7C816D4F.

Take note of the 3 nops (90h), they will come into play later. In a very convoluted way, the malware checks if one of these nops is found through the stack pointer.

Code:

:7C816D2C BaseProcessStart:
...
:7C816D46 call ds:NtSetInformationThread
:7C816D4C call dword ptr [ebp+8] // Entry Point of PE
:7C816D4F push eax
:7C816D50 call ExitThread
:7C816D55 nop
:7C816D56 nop
:7C816D57 nop



Taking a look at the malware code

Code:

:00401133 public start
:00401133 start:
:00401133 jmp short loc_401156
:00401133 ; --------------------------------------------

:00401156
:00401156 loc_401156: ; CODE XREF: :start
:00401156 mov ecx, 0FFBFEFCEh
:0040115B not ecx
:0040115D call ecx ; 401031



Following the Call ECX, all it does in the end is place the return address into BaseProcessStart in EBP.

Code:

:00401031 sub_401031 proc near
:00401031 push esp // stack pointer
:00401032 pop ecx // 12FFC0
:00401033 add esi, esi
:00401035 shl ecx, 10h // FFC00000
:00401038 xor edx, 42426ADBh
:0040103E shr ecx, 10h // 0000FFC0
:00401041 sub eax, 1A6F87D3h
:00401047 add ecx, 120004h // 12FFC4
:0040104D sar esi, 16h
:00401050 mov ebp, [ecx] // 7C816D4F
:00401052 dec edx
:00401053 retn
:00401053 sub_401031 endp
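
Stripped of the junk instructions, the two listings above reduce to a few operations, sketched here in Python. The function names are mine, and the XPsp3 input 0x13FFC0 is an assumption derived from the 0x13FFC4 entry value reported earlier in the thread (one return address has been pushed by the Call ECX).

```python
MASK32 = 0xFFFFFFFF

def decode_call_target(value=0xFFBFEFCE):
    """mov ecx, 0FFBFEFCEh / not ecx"""
    return ~value & MASK32

def sub_401031_read_address(esp_in_call):
    """Address read by 'mov ebp, [ecx]' for a given ESP inside the call."""
    ecx = esp_in_call                 # push esp / pop ecx
    ecx = (ecx << 16) & MASK32        # shl ecx, 10h  (drops the upper 16 bits)
    ecx >>= 16                        # shr ecx, 10h
    return (ecx + 0x120004) & MASK32  # add ecx, 120004h

print(hex(decode_call_target()))
print(hex(sub_401031_read_address(0x12FFC0)))  # XPsp2 value from the listing
print(hex(sub_401031_read_address(0x13FFC0)))  # assumed XPsp3 value
```

The upper 16 bits of the stack pointer are thrown away and replaced by the constant 0012h, so the result is the same for both inputs.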



Continuing with the original code after returning from the Call ECX, everything is pretty much useless obfuscated code until we start manipulating ESI at 004011A8.

EBP at this point is the return address in BaseProcessStart
:7C816D4F push eax


Code:

:0040115F sal ecx, 6
:00401162 sub edx, edx
:00401164 xor edx, 9
:0040116A rol esi, 16h
:0040116D add edx, esp
:0040116F add esi, 6F6B485Bh
:00401175 push edx
:00401176 pop eax
:00401177 inc esi
:00401178 sub esi, 8E56E123h
:00401178 ; --------------------------------------------
:0040117E dw 0
:00401180 dd 8 dup(0)
:004011A0 ; --------------------------------------------
:004011A0 ror cx, 8
:004011A4 or dx, 0FFA4h

:004011A8 mov esi, 0
:004011AD xor esi, 7 // 7 (to add 7 bytes)
:004011B3 inc ecx
:004011B4 rol edi, 0Eh
:004011B7 add esi, ebp // 7C816D4F + 7 = 7C816D56

ESI now points to the BaseProcessStart instruction
:7C816D56 90 nop


:004011B9 or eax, 7476EA6Bh
:004011BF mov ecx, 90h
:004011C4 sub cl, [esi] // 90 - 90 = 0
:004011C6 add di, 5Dh
:004011CA test ecx, ecx // test
:004011CC jz loc_401202 // jump if OK

:004011D2 push 0E7514E8Ah
:004011D7 call ds:ExitThread


So basically, all this code just checks whether a NOP is found 7 bytes past the stack return address (the 4th instruction counting from it); if not, it exits (actually it crashes with an invalid Exit Code). Perhaps it is a check for a loader, or possibly an OS version check, since it will seemingly fail on XPsp3 (because it uses a hardcoded 120004h) for the same reasons as described by Zairon.
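
As a rough illustration, the check boils down to the Python sketch below. The byte layout follows the BaseProcessStart listing above (1-byte push eax, 5-byte call, then the nops); the rel32 operand of the call is a placeholder, and the function name is mine.

```python
RETURN_ADDRESS = 0x7C816D4F

# Bytes starting at the return address, per the BaseProcessStart listing
code_at_return = bytes([
    0x50,                          # 7C816D4F push eax
    0xE8, 0x00, 0x00, 0x00, 0x00,  # 7C816D50 call ExitThread (placeholder rel32)
    0x90, 0x90, 0x90,              # 7C816D55..57 nop nop nop
])

def passes_nop_check(code, offset=7):
    """mov ecx, 90h / sub cl, [esi] / test ecx, ecx / jz ok"""
    ecx = (0x90 - code[offset]) & 0xFF
    return ecx == 0

print(hex(RETURN_ADDRESS + 7), passes_nop_check(code_at_return))
```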

Kayaker