Swimming upstream... by j!m.

"I've always believed that history plays a very important part in reverse engineering and this gem from j!m discussing a tool from 1992 merely convinces me further, I particularly like the 'getting back' to good old Turbo Debugger and non-trivial nature of the final solution which required research to overcome. There are also some interesting links herein and some good examples of old fashioned intuition ;-), I will make no comment regarding the animated GIF inside the password protected zip ;-)." "Essay slightly edited by CrackZ".

Target Details

zipcrack 2.0 by Paul Kocher.

This is very old stuff, designed in 1992 by a famous cryptographer (the author of the Differential Power Analysis attacks) to demonstrate weaknesses in the encryption algorithm used by PKZIP 1.0. The program is crippled and can only search for passwords beginning with the letter 'z' (seems simple...). Moreover, the 'optimized search feature', which relied on undocumented properties of the encryption header, no longer works with newer versions of the PKZIP algorithm, so you can't use it.

But it doesn't matter: nowadays you can find plenty of efficient tools to crack zip passwords. These tools offer dictionary attacks, brute force, known-plaintext attacks... try zipkey or Advanced ZIP Password Recovery if you want to play with some of them. My goal is not to provide you with the latest keygens or patches for the latest commercial releases. My goal is to learn and to share what I've learned. If you follow this essay, you should at least learn something about the PKZIP stream cipher (I did!).

Tools

Turbo Debugger for DOS, Hedit 2.11, an encrypted archive (password: reverse), Windows calculator.

Launch Turbo Debugger, open the program and, in the run menu, choose the 'arguments...' entry to enter the full path of the encrypted zip archive. As you can read in the zipcrack documentation, the program can only search for passwords beginning with the letter 'z'. Without being a wizard I can feel it... a closer look at my ASCII table tells me that the code for the letter 'z' is 0x7A, so let's search the disassembly for the value '7A'. You should find something like this:

cs:0B21 cmp byte ptr [bp-0102],7A
cs:0B26 jne 0B2F
cs:0B28 cmp byte ptr [bp-0202],7A
cs:0B2D je 0B44
cs:0B2F cmp word ptr [3342],0000
cs:0B34 jne 0B44
cs:0B36 mov ax,3AEC
cs:0B39 push ax
cs:0B3A call 1C52
cs:0B3D pop cx
cs:0B3E mov word ptr [3342],0001

cs:0B44 mov byte ptr [bp-0102],7A
cs:0B49 mov byte ptr [bp-0202],7A
cs:0B4E mov byte ptr [si],7A
cs:0B51 cmp word ptr [bp+06],0000
cs:0B55 je 0B6E
cs:0B57 mov ax,3F70
cs:0B5A push ax
cs:0B5B mov ax,si
cs:0B5D inc ax
cs:0B5E push ax
cs:0B5F lea ax,[bp-0201]
cs:0B63 push ax
cs:0B64 lea ax,[bp-0101]
cs:0B68 push ax
cs:0B69 call 115B
cs:0B6C jmp 0B83

cs:0B6E mov ax,3F72
cs:0B71 push ax
cs:0B72 mov ax,si
cs:0B74 inc ax
cs:0B75 push ax
cs:0B76 lea ax,[bp-0201]
cs:0B7A push ax
cs:0B7B lea ax,[bp-0101]
cs:0B7F push ax
cs:0B80 call 0BBC
cs:0B83 add sp,0008
cs:0B86 pop di
cs:0B87 pop si
cs:0B88 mov sp,bp
cs:0B8A pop bp
cs:0B8B ret

OK, it looks like we have found the trick! Put a breakpoint on the first cmp instruction and run the program (do not forget to give it the zip file as an argument). Don't use the optimized search. Type *reverse and press enter. The program breaks.

Dump memory at [bp-0102]... do you see the same thing as me? This is the password we typed! Wonderful. The logic here is quite simple: the program tests whether the string entered begins with the letter 'z' (it checks the first byte at both [bp-0102] and [bp-0202]). If not, it jumps to cmp word ptr [3342],0000, which checks whether it has already displayed the advertisement saying that it can only find passwords beginning with the letter 'z'; if not, it displays it and sets the flag. After this, it forces the first letter to 'z', tests whether you want the optimized search or not, and branches according to your choice.

It seems that all we have to do here is to turn the jne 0B2F into an unconditional jmp 0B51, which skips both the advertisement and the three instructions that force the first letter to 'z'. So launch your favourite hex editor, open zipcrack.com, search for 750780BEFEFD7A and replace 7507 with EB29. Let's verify: open a DOS window and launch zipcrack with the right argument. I say yes, you say no, and oops, it's not time for the Beatles yet! Go on, type *reverse and smash the enter key to see the result of your hard work. NOTHING HAPPENS... maybe this program doesn't work well... but no, it works: if you test it on an archive encrypted with a password beginning with 'z', it prints 'MATCH'. So, where is the problem?
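Before touching the file, it is worth decoding where a short jump actually lands: the target is the address of the next instruction plus the signed displacement byte. The following is only a minimal Python sketch of that arithmetic and of a cautious search-and-replace; the helper names short_jump_target and patch_once are made up for this illustration and come from neither zipcrack nor any tool mentioned here.

```python
def short_jump_target(insn_addr, rel8):
    """Target of a 2-byte short jump (opcode + signed 8-bit displacement)."""
    if rel8 >= 0x80:              # the displacement is a signed byte
        rel8 -= 0x100
    return (insn_addr + 2 + rel8) & 0xFFFF


def patch_once(data, pattern, old, new):
    """Replace `old` by `new` at the start of the single match of `pattern`."""
    if len(old) != len(new) or not pattern.startswith(old):
        raise ValueError("old must be a prefix of pattern, same length as new")
    if data.count(pattern) != 1:
        raise ValueError("pattern must occur exactly once")
    pos = data.find(pattern)
    return data[:pos] + new + data[pos + len(old):]


# jne +7 followed by cmp byte ptr [bp-0202],7A, as in the listing
pattern = bytes.fromhex("750780BEFEFD7A")

print(hex(short_jump_target(0x0B26, 0x07)))   # landing point of 75 07
print(hex(short_jump_target(0x0B26, 0x29)))   # landing point of EB 29

# Hypothetical usage on a backup copy of the executable:
# data = open("zipcrack.com", "rb").read()
# open("zipcrack_patched.com", "wb").write(
#     patch_once(data, pattern, b"\x75\x07", b"\xEB\x29"))
```

Insisting on exactly one match is a deliberate choice: a seven-byte pattern is long enough to be unique in a small program, and refusing to patch otherwise avoids silently damaging the wrong spot.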

To catch it, we have to trace deeper into the program and have a look at the PKZIP stream cipher implementation. Reload the program in Turbo Debugger, put a breakpoint on the cmp and run it. When the program breaks, trace step by step until you reach the call 0BBC. Have you noticed that ax is loaded with the address of the string you entered + 1 before being pushed? Seems odd. Keep tracing, step into the call and go on until you reach these lines:

cs:0BFA mov dword ptr [3F78],EE1C557A
cs:0C03 mov dword ptr [3F7C],D2149410
cs:0C0C mov dword ptr [3F80],98E676C4

Do you feel it? It looks like crypto initialization or something like that. At this point it is time to have a closer look at the PKZIP stream cipher specification. I suggest you read a paper published by +Tsehp called 'ZIP Attacks with Reduced Known-Plaintext'; paragraph 1.1 of this paper describes the cipher and says that:

'...The internal state of the cipher consists of three 32-bit words: key0, key1, key2. these values are initialized to 0x12345678, 0x23456789, 0x34567890, respectively...'

It doesn't look like the instructions we found in the program.

'..The cipher is keyed by encrypting the user password and throwing away the corresponding stream bytes. The stream bytes produced after this point are XORed with the plaintext to produce the ciphertext...'

The important thing to understand here is that a stream cipher has an internal state, defined here by key0, key1 and key2, and that in order to decrypt something we first have to update this state with the password bytes. Are you curious? What is the state of the cipher after it has processed the character 'z'? Good question. Answer: let's compute it together... first the initialization.

key(0) = 0x12345678
key(1) = 0x23456789
key(2) = 0x34567890

Now update the keys with the byte 'z' (0x7A), like this:

key(0) = crc32(key(0), 0x7A) where
crc32(crc, b) = (crc >> 8) ^ crctab[(crc & 0xFF) ^ b]

I've searched some PKZIP source code and found the crc32 table used; find it yourself or trust the values I give you here.
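If you would rather not take table values on trust, the table can be regenerated. The sketch below assumes the table is the standard reflected CRC-32 one (polynomial 0xEDB88320), which is what the entries quoted in this essay correspond to; the function name make_crctab is mine, not taken from any PKZIP source.

```python
POLY = 0xEDB88320  # reflected CRC-32 polynomial (assumed, see above)


def make_crctab():
    """Build the 256-entry CRC-32 lookup table, one entry per byte value."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):                      # process the 8 bits of n
            c = (c >> 1) ^ POLY if c & 1 else c >> 1
        table.append(c)
    return table


crctab = make_crctab()


def crc32(crc, b):
    """One-byte CRC step, written exactly as in the formula above."""
    return (crc >> 8) ^ crctab[(crc & 0xFF) ^ b]


# The two entries needed for the computation with 'z'
print(f"crctab[0x02] = {crctab[0x02]:08X}")
print(f"crctab[0x42] = {crctab[0x42]:08X}")
```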

key(0) = 0x00123456 ^ crctab[0x78 ^ 0x7A] = 0x00123456 ^ crctab[0x02] = 0x00123456 ^ 0xEE0E612C = 0xEE1C557A

Looks pretty good, no?

key(1) = (key(1) + (key(0) & 0xFF)) * 0x08088405 + 1 = (0x23456789 + 0x7A) * 0x08088405 + 1 = 0xD2149410 (keeping only the low 32 bits)

Good! Let's go on...

key(2) = crc32(key(2), key(1) >> 24) = 0x00345678 ^ crctab[0x90 ^ 0xD2] = 0x00345678 ^ crctab[0x42] = 0x00345678 ^ 0x98D220BC = 0x98E676C4
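The three hand computations above can be cross-checked with a few lines of Python. This is a sketch under the same assumptions as the text (the update rule as quoted from the paper and the standard CRC-32 table, rebuilt here so the block runs on its own); update_keys and key_state are names chosen for this example.

```python
MASK = 0xFFFFFFFF


def _crc_entry(n):
    """One entry of the standard reflected CRC-32 table."""
    for _ in range(8):
        n = (n >> 1) ^ 0xEDB88320 if n & 1 else n >> 1
    return n


CRCTAB = [_crc_entry(n) for n in range(256)]


def update_keys(keys, byte):
    """Advance (key0, key1, key2) by one password byte."""
    k0, k1, k2 = keys
    k0 = (k0 >> 8) ^ CRCTAB[(k0 & 0xFF) ^ byte]
    k1 = ((k1 + (k0 & 0xFF)) * 0x08088405 + 1) & MASK   # 32-bit wraparound
    k2 = (k2 >> 8) ^ CRCTAB[(k2 & 0xFF) ^ (k1 >> 24)]
    return k0, k1, k2


def key_state(password, keys=(0x12345678, 0x23456789, 0x34567890)):
    """Key the cipher with `password`, starting from `keys`."""
    for b in password:
        keys = update_keys(keys, b)
    return keys


after_z = key_state(b"z")
print(" ".join(f"{k:08X}" for k in after_z))

# Starting from the state after 'z' and feeding the rest of the password
# gives the same result as feeding the whole password from the standard state.
print(key_state(b"zebra") == key_state(b"ebra", after_z))
```

The last line is the point of the exercise: a program that starts from the state after 'z' only needs the password from its second character onwards, which fits the string address + 1 being passed to the call.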

That's it! Paul Kocher has 'forwarded' the cipher to the state it reaches after processing 'z'. All we have to do is move the cipher state back to its starting point.
To do this, we have to modify the three mov instructions so that they load the standard initialization values. But that's not all: we also have to correct the addresses pushed before the call so that they point to the beginning of the password string. That means nop'ing the inc ax instruction, changing lea ax,[bp-0201] to lea ax,[bp-0202] and changing lea ax,[bp-0101] to lea ax,[bp-0102].

I'll let you do this with a hex editor. Don't forget that Intel processors are little-endian ;-) so you have to search for the bytes 7A 55 1C EE and replace them with 78 56 34 12, and so on for key1 and key2. Everything should work well after that.
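To avoid byte-order slips in the hex editor, the byte sequences can be generated rather than reversed by hand. This is a small sketch using Python's struct module; it only shows how the 32-bit immediates and the signed 16-bit bp displacements are laid out in memory, and the helper names le32 and le16_signed are mine.

```python
import struct


def le32(value):
    """32-bit value as it appears in the file (least significant byte first)."""
    return struct.pack("<I", value)


def le16_signed(value):
    """Signed 16-bit displacement, e.g. the -0x0202 in [bp-0202]."""
    return struct.pack("<h", value)


# (value found in the program, standard initialization value)
KEY_PATCHES = [
    (0xEE1C557A, 0x12345678),   # key0
    (0xD2149410, 0x23456789),   # key1
    (0x98E676C4, 0x34567890),   # key2
]

for found, wanted in KEY_PATCHES:
    print(le32(found).hex(" ").upper(), "->", le32(wanted).hex(" ").upper())

# Displacement bytes of the lea operands before and after the change
for disp in (-0x0201, -0x0202, -0x0101, -0x0102):
    print(hex(disp), "=", le16_signed(disp).hex(" ").upper())
```

The same layout explains the earlier search string: the FE FD in 80BEFEFD7A is simply -0x0202 stored low byte first.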

That's all folks!

Additional readings

APPNOTE.TXT, the official PKZIP format specification; find the latest version on the PKWARE site. To learn more about stream ciphers I suggest this link. Found any errors, or have comments or suggestions? Write to me!



© 1998, 1999, 2000, 2001 By j!m, proudly hosted by CrackZ. 19th September 2001.