reversing an algorithm [Archive] - RCE Messageboard's Regroupment

View Full Version : reversing an algorithm

Vigual

December 9th, 2009, 20:00

I am trying to reverse an algorithm but I have a few questions. The purpose of this algorithm is to find the dword that contains the last character in a string. Basically, it searches for a 00 byte in the dword.

THE FIRST PART:

ESP+4 is a pointer to an address. So ECX contains an address. By TESTing the address with 3, it seems like it checks to see if the address is a multiple of 4, since if it is not it will have bits sent between 3 and 0. I don't really get why that is done here. It seems like it's because the algorithm is working in dwords, which are 4 bytes, but I still don't get why that matters because you can get dwords from any address, not just ones that are multpiles of 4. Maybe I'm wrong here.

Code:



                 MOV ECX,DWORD PTR SS:[ESP+4]

                 TEST ECX,3  

                 JE SHORT 004A8A80



004A8A5C - MOV AL,BYTE PTR DS:[ECX]

                 ADD ECX,1

                 TEST AL,AL

                 JE SHORT 004A8AB3

                 TEST ECX,3

                 JNZ SHORT 004A8A5C



                 ADD EAX,0

                 LEA ESP,DWORD PTR SS:[ESP]

                 LEA ESP,DWORD PTR SS:[ESP]

THE SECOND PART
This second part seems to check if there are 00s in the dword. Here's what I think I know about the algorithm. 7EFEFEFF xor FFFFFFFF = 81010101. So, what I guess should happen is

Code:

 ADD EDX,EAX

                 XOR EAX,FFFFFFFF

                 XOR EAX,EDX

Should cause 7EFEFEFF to be inverted to 81010101, it should also add the inital value of EAX to EDX, which is obviously done, then later subtract intial EAX which I don't see. I don't really get how this is used to test if EAX contains a 00.

Code:



004A8A80 - MOV EAX,DWORD PTR DS:[ECX]

                 MOV EDX,7EFEFEFF

                 ADD EDX,EAX

                 XOR EAX,FFFFFFFF

                 XOR EAX,EDX

                 ADD ECX,4

                 TEST EAX,81010100

                 JE SHORT 004A8A80

Here is the entire code just for context

Code:



                 MOV ECX,DWORD PTR SS:[ESP+4]

                 TEST ECX,3  

                 JE SHORT 004A8A80



004A8A5C - MOV AL,BYTE PTR DS:[ECX]

                 ADD ECX,1

                 TEST AL,AL

                 JE SHORT 004A8AB3

                 TEST ECX,3

                 JNZ SHORT 004A8A5C



                 ADD EAX,0

                 LEA ESP,DWORD PTR SS:[ESP]

		  LEA ESP,DWORD PTR SS:[ESP]



004A8A80 - MOV EAX,DWORD PTR DS:[ECX]

                 MOV EDX,7EFEFEFF

                 ADD EDX,EAX

                 XOR EAX,FFFFFFFF

                 XOR EAX,EDX

                 ADD ECX,4

                 TEST EAX,81010100

                 JE SHORT 004A8A80

I hope my explanation is not too confusing.

drizz

December 9th, 2009, 21:38

reversing strlen?

Code:

\VC\crt\src\intel\strlen.asm

Vigual

December 9th, 2009, 23:55

I'm new. Sorry. Could you explain how the function works? I don't understand why the values 7EFEFEFF and 81010100 are picked. Why does it make sure the address is divisible by 4? I want to learn.

drizz

December 10th, 2009, 07:03

This is speed optimized strlen.

Checks if address is divisible by 4 so it can read a dword (faster than byte read) and not cause a page fault because unaligned reads access two dwords.

Any zero byte in dword will cause the bits at positions selected by 0x81010100 to be set. Or, if all bytes are non zero "test eax,0x81010100" will always have zero flag set. It works for any dword 0x1-0xFFFFFFFF.

Here's similar tehnique by Agner Fog

Code:

        ; Main loop, read 4 bytes aligned

L1:     add     eax, 4                 ; increment pointer by 4

L2:     mov     ebx, [eax]             ; read 4 bytes of string

        lea     ecx, [ebx-01010101H]   ; subtract 1 from each byte

        not     ebx                    ; invert all bytes

        and     ecx, ebx               ; and these two

        and     ecx, 80808080H         ; test all sign bits

        jz      L1                     ; no zero bytes, continue loop

crt_strlen creates overflow by adding 0x7efefeff to dword, agner_strlen creates overflow by substracting 0x01010101 from dword.

Vigual

December 10th, 2009, 09:21

thanks so much for the reply

bilbo

December 10th, 2009, 15:26

http://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord

dELTA

December 14th, 2009, 18:21

Always nice to see you're still around Bilbo, and happy anniversary on your 256th post!

bilbo

December 15th, 2009, 00:24

Quote:

[Originally Posted by dELTA]Always nice to see you're still around Bilbo, and happy anniversary on your 256th post!

thanks, dELTA, I appreciate your hex view of the world; I hope I will not overflow anyway... In my day, processors had only 8 bits!

Best regards, bilbo