Log in

View Full Version : reversing an algorithm


Vigual
December 9th, 2009, 20:00
I am trying to reverse an algorithm but I have a few questions. The purpose of this algorithm is to find the dword that contains the last character in a string. Basically, it searches for a 00 byte in the dword.

THE FIRST PART:

ESP+4 is a pointer to an address. So ECX contains an address. By TESTing the address with 3, it seems like it checks to see if the address is a multiple of 4, since if it is not it will have bits sent between 3 and 0. I don't really get why that is done here. It seems like it's because the algorithm is working in dwords, which are 4 bytes, but I still don't get why that matters because you can get dwords from any address, not just ones that are multpiles of 4. Maybe I'm wrong here.

Code:

MOV ECX,DWORD PTR SS:[ESP+4]
TEST ECX,3
JE SHORT 004A8A80

004A8A5C - MOV AL,BYTE PTR DS:[ECX]
ADD ECX,1
TEST AL,AL
JE SHORT 004A8AB3
TEST ECX,3
JNZ SHORT 004A8A5C

ADD EAX,0
LEA ESP,DWORD PTR SS:[ESP]
LEA ESP,DWORD PTR SS:[ESP]



THE SECOND PART
This second part seems to check if there are 00s in the dword. Here's what I think I know about the algorithm. 7EFEFEFF xor FFFFFFFF = 81010101. So, what I guess should happen is

Code:
ADD EDX,EAX
XOR EAX,FFFFFFFF
XOR EAX,EDX


Should cause 7EFEFEFF to be inverted to 81010101, it should also add the inital value of EAX to EDX, which is obviously done, then later subtract intial EAX which I don't see. I don't really get how this is used to test if EAX contains a 00.

Code:

004A8A80 - MOV EAX,DWORD PTR DS:[ECX]
MOV EDX,7EFEFEFF
ADD EDX,EAX
XOR EAX,FFFFFFFF
XOR EAX,EDX
ADD ECX,4
TEST EAX,81010100
JE SHORT 004A8A80




Here is the entire code just for context


Code:

MOV ECX,DWORD PTR SS:[ESP+4]
TEST ECX,3
JE SHORT 004A8A80

004A8A5C - MOV AL,BYTE PTR DS:[ECX]
ADD ECX,1
TEST AL,AL
JE SHORT 004A8AB3
TEST ECX,3
JNZ SHORT 004A8A5C

ADD EAX,0
LEA ESP,DWORD PTR SS:[ESP]
LEA ESP,DWORD PTR SS:[ESP]

004A8A80 - MOV EAX,DWORD PTR DS:[ECX]
MOV EDX,7EFEFEFF
ADD EDX,EAX
XOR EAX,FFFFFFFF
XOR EAX,EDX
ADD ECX,4
TEST EAX,81010100
JE SHORT 004A8A80



I hope my explanation is not too confusing.

drizz
December 9th, 2009, 21:38
reversing strlen?
Code:
\VC\crt\src\intel\strlen.asm

Vigual
December 9th, 2009, 23:55
I'm new. Sorry. Could you explain how the function works? I don't understand why the values 7EFEFEFF and 81010100 are picked. Why does it make sure the address is divisible by 4? I want to learn.

drizz
December 10th, 2009, 07:03
This is speed optimized strlen.

Checks if address is divisible by 4 so it can read a dword (faster than byte read) and not cause a page fault because unaligned reads access two dwords.

Any zero byte in dword will cause the bits at positions selected by 0x81010100 to be set. Or, if all bytes are non zero "test eax,0x81010100" will always have zero flag set. It works for any dword 0x1-0xFFFFFFFF.

Here's similar tehnique by Agner Fog
Code:
; Main loop, read 4 bytes aligned
L1: add eax, 4 ; increment pointer by 4
L2: mov ebx, [eax] ; read 4 bytes of string
lea ecx, [ebx-01010101H] ; subtract 1 from each byte
not ebx ; invert all bytes
and ecx, ebx ; and these two
and ecx, 80808080H ; test all sign bits
jz L1 ; no zero bytes, continue loop


crt_strlen creates overflow by adding 0x7efefeff to dword, agner_strlen creates overflow by substracting 0x01010101 from dword.

Vigual
December 10th, 2009, 09:21
thanks so much for the reply

bilbo
December 10th, 2009, 15:26
http://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord

dELTA
December 14th, 2009, 18:21
Always nice to see you're still around Bilbo, and happy anniversary on your 256th post!

bilbo
December 15th, 2009, 00:24
Quote:
[Originally Posted by dELTA]Always nice to see you're still around Bilbo, and happy anniversary on your 256th post!

thanks, dELTA, I appreciate your hex view of the world; I hope I will not overflow anyway... In my day, processors had only 8 bits!

Best regards, bilbo