View Full Version : Question about an algorithm


WaxfordSqueers
June 4th, 2009, 01:43
I have a file that seems to be packed, but it's not malware and the aim may not be to divert reversers. Then maybe it is. It's a BIOS file that I want to load in a specific utility from a specific BIOS vendor. I have a known good file of similar size and format that the utility accepts.

The file is 512k long (0x80000) and it has an ASCII signature word located 10 bytes in from a paragraph boundary. So, if the word is foofoo, and it's found in the paragraph at F4000, foofoo will begin at F400A. The app knows that but has to find it.

foofoo is followed by about 10 bytes. First, the app searches the file for foofoo, always looking for it 0xA bytes into a line (from the 0 position). It searches every other line by incrementing the initial file base address by 0x20 until it finds it. Then it compares it to the bytes at EOF-20. There is never a match with the signature word, but the good file loads anyway.
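A minimal sketch of the scan described above, in Python rather than the utility's own code. The function name, the synthetic image and the planted offset 0x7400A are made up for illustration; only the 0xA column and the 0x20 step come from the description.

```python
def find_signature(data: bytes, signature: bytes, start: int = 0,
                   step: int = 0x20, column: int = 0xA) -> int:
    """Return the file offset of `signature`, or -1 if it is not found.

    Only the position `column` bytes into each `step`-byte stride is tested,
    mirroring the scan described in the post.
    """
    base = start
    while base + column + len(signature) <= len(data):
        pos = base + column
        if data[pos:pos + len(signature)] == signature:
            return pos
        base += step
    return -1


# Synthetic 512 KiB image with the placeholder signature planted at 0x7400A.
image = bytearray(b"\xff" * 0x80000)
image[0x7400A:0x7400A + 6] = b"foofoo"
hit = find_signature(bytes(image), b"foofoo")
print(hex(hit))
```

A signature that sits at 0xA in one of the skipped lines would not be found by this scan, which matches the "every other line" behaviour described.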

After that, the app takes the last two bytes at the end of the string after foofoo and puts them in EAX. It puts the two bytes before those in ECX. Then it does an SHL 4 on EAX, which multiplies the value by 0x10 (16), and it adds ECX to EAX. That gives a value larger than the file size, but not by much. The file size, 0x80000, is subtracted from that value.
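The arithmetic in this step looks like a real-mode segment:offset pair being turned into a file offset. A small Python sketch, assuming both words are little-endian and that the offset word comes first (the post only says the segment-like word is the later one); the pointer value F400:000A is hypothetical.

```python
import struct

FILE_SIZE = 0x80000  # 512 KiB, as stated in the post


def pointer_to_file_offset(raw: bytes, file_size: int = FILE_SIZE) -> int:
    """Turn 4 bytes (offset word, then segment word) into a file offset.

    Assumes both words are little-endian, which the post does not state.
    """
    offset, segment = struct.unpack("<HH", raw)
    linear = (segment << 4) + offset   # SHL 4 multiplies by 0x10 (16)
    return linear - file_size          # subtract the file size, as the app does


# Hypothetical pointer F400:000A -> linear 0xF400A -> file offset 0x7400A
raw = struct.pack("<HH", 0x000A, 0xF400)
print(hex(pointer_to_file_offset(raw)))
```

For a 512k image the result always stays inside the file as long as the linear address lies between 0x80000 and 0xFFFFF, which fits the "larger than the file size but not by much" observation.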

The derived value is added to the file base address to get an offset into the file which is always on a boundary. Then, the bytes located 7 bytes into that offset are checked against a magic number. They never seem to match. After that, bytes are taken from the new offset code and operated on exactly as before, with two specified bytes loaded into the EAX, a SHL 4, and an addition of ECX. The process is repeated, with the new magic being always added to the file base. The offset it produces is compared against magic numbers and the app works through the code looking for something.

I'm wondering if anyone recognizes this algorithm? I'm not looking for a specific answer, just a bit of feedback. Any ideas are welcome. The values being checked don't seem to be hard values programmed into the file, so I've considered the algorithm may be a CRC check of some kind. The values in the string following foofoo may be the result of a CRC algorithm run on the app initially. I have also considered the algorithm may be part of an unpacking scheme. The file is moved around in memory a lot.

I'm asking because the loop for testing and creating the magic numbers (for new offsets) is long and it would help if I had an idea with regard to what is going on.

evaluator
June 4th, 2009, 03:16
maybE, you should show algo-disasm code.

WaxfordSqueers
June 5th, 2009, 01:56
Quote:
[Originally Posted by evaluator;80936]maybE, you should show algo-disasm code.
Thanks for response evaluator. Here's one better, in case you're interested in examining the BIOS:

I found the BIOS file here:

http://rapidshare.com/files/136844316/A163FIL1.rom

And some BIOS tools here:

http://www.rebelshavenforum.com/sis-bin/ultimatebb.cgi?ubb=get_topic;f=52;t=000004;p=0

They all seem legit...links are to free tools available on the net. The one to which I am referring is under "Newly Discovered AMI Tools Package for AMI BIOS: Tool_8_RC1.rar"

If you observe how it loads the ROM file, you'll notice the code at 4015A2 is examining the ROM file and it loops between there and 40160D. There are 3 significant checks in the loop. Just after 4015A9, it is checking [ESI+7]. Then it tests EAX to see if it is 0. On the previous routine, just before the RET, it puts a 1 in EAX.

At 4015C1, it compares [ESI+6] to 80. Then near the end of the loop, it compares [ESI] to -1. There are different sized areas throughout the BIOS file filled with FF's, but from what I can see, separation between sections is done with 00's.

Here's the code for the section in question:

Code:
4015A2 lea esi, [eax+edi]<- magic # + file base
4015A5 cmp esi, ebx
4015A7 ja short loc_40161A
4015A9 movzx eax, [ebp+arg_8] <- usually = 0
4015AD mov cl, [esi+7] <-ESI points to base + magic
4015B0 sub eax, 0 <-test #1
4015B3 jz short loc_4015C1 <- still in loop
4015B5 dec eax
4015B6 jnz short loc_4015F4
4015B8 cmp dl, [ebp+arg_C]
4015BB jnz short loc_4015F4
4015BD mov eax, esi
4015BF jmp short loc_40161C

4015C1 mov al, [ebp+arg_C] <- magic #2 = 80
4015C4 cmp [esi+6], al <-cmp to byte 6 in magic offset
4015C7 jnz short loc_4015F4 <-test#2
4015C9 test cl, 40h
4015CC jnz short loc_4015F4
4015CE test cl, 20h
4015D1 jz short loc_4015E1
4015D3 cmp [ebp+var_8], 0
4015D7 jnz short loc_4015DC
4015D9 mov [ebp+var_8], esi

4015DC inc byte ptr [ebp+arg_4+3]
4015DF jmp short loc_4015F4

4015E1 mov al, byte ptr [ebp+arg_0+3]
4015E4 cmp al, [ebp+arg_10]
4015E7 jz short loc_40160F
4015E9 and byte ptr [ebp+arg_4+3], 0
4015ED and [ebp+var_8], 0
4015F1 inc byte ptr [ebp+arg_0+3]

4015F4 push [ebp+var_C]
4015F7 inc dl
4015F9 mov [ebp+var_1], dl
4015FC push [ebp+var_10]
4015FF push esi
401600 call sub_401EBF <- more magic numbers created
401605 cmp dword ptr [esi], 0FFFFFFFFh <-???
401608 jz short loc_40161A <-loop exit if [ESI]= -1
40160A mov dl, [ebp+var_1]
40160D jmp short loc_4015A2 <-back you go.
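For readers who prefer a higher-level view, here is a rough Python rendering of the byte tests on the arg_8 == 0 path of the listing (4015C1 to 4015D1). It only mirrors the comparisons that are visible above; the return labels are invented, and what the 0x40 and 0x20 bits actually mean is not known from the thread.

```python
def classify_header(data: bytes, off: int, wanted: int) -> str:
    """Mirror the byte tests at 4015C1..4015D1 for the arg_8 == 0 path."""
    ident = data[off + 6]      # byte compared against arg_C (0x80 in the trace)
    flags = data[off + 7]      # byte loaded into CL at 4015AD
    if ident != wanted:
        return "skip"          # cmp [esi+6], al / jnz loc_4015F4
    if flags & 0x40:
        return "skip"          # test cl, 40h / jnz loc_4015F4
    if flags & 0x20:
        return "bit20-set"     # falls through to 4015D3
    return "bit20-clear"       # jz loc_4015E1


# Hypothetical 8-byte header: id 0x80 at +6, flags 0x00 at +7.
hdr = bytes([0, 0, 0, 0, 0, 0, 0x80, 0x00])
print(classify_header(hdr, 0, 0x80))
```

Every "skip" result corresponds to the jump to 4015F4, where the loop asks sub_401EBF for the next offset.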


The aim is to unpack the BIOS file, although there is a tool called MMTool that will unpack segments. You can find it in the package listed above. I am trying to understand how the BIOS files are packed with the hope that I can put my AMI-based BIOS file together so I can read it. Unfortunately, it is split into 6 different files.

If I can put it together, I can hopefully identify the segment that deals with the Silicon Image controller and replace it with a more recent version. I know they are using a Silicon Image controller even though they refer to a Promise controller in the BIOS. Don't want to think what that means.

WaxfordSqueers
June 5th, 2009, 02:04
Quote:
[Originally Posted by evaluator;80936]maybE, you should show algo-disasm code.
In case this interests you, and to save you a lot of work, the fun starts at 401265. You'll know what I mean if you examine the file as I described. That's where the file base is established and shortly thereafter, the entire file is moved to a new location with memcpy. You will see the file size compared to 0x7fff and 0x100000. Then it looks for its signature, which is quite obvious. After that, it starts playing with the magic numbers it finds tacked on to the end of the signature.

evaluator
June 5th, 2009, 13:03
in rom file i see string 'RSA1'

WaxfordSqueers
June 5th, 2009, 22:34
Quote:
[Originally Posted by evaluator;80958]in rom file i see string 'RSA1'
Thanks for tip. The RSA1 is a part of a small segment in the file called a 'ROM hole'. I don't know exactly what that is yet but I think it's like a code hole people look for to insert code, only this one is there on purpose. There is another ROM hole that is much bigger (64k) and sits right at the front of the ROM. On my Intel ROM, that area begins with a large chunk of FF's, so maybe it's a ROM hole as well.

Here's the hex dump for the smaller ROM hole:

Code:
0000 0000 9C00 0000 0602 0000 0024 0000
5253 4131 0004 0000 0100 0100 8771 4FEE
BDFF A49A 0C31 8926 2285 DF17 C576 8256
08FF 5607 94CF 2070 CC44 B486 D1CA 68EE
34D3 9C1A F1BA 18E3 8E40 26BA 658A DD6D
9285 BFE9 1006 8EFE 52FA 317C 1ED2 2100
4B44 9B98 98CF 4C43 2A41 AB8E 2FE0 FB92
CCB5 DD05 D5DA 2965 5AA4 C944 7FAA B3AB
FACF 5D7A 1923 4DE9 0971 CFDC 8761 12D0
C248 4427 374C 505B 2D42 66C4 0100 0000
B600 0000 0000 0200 4C47 4520 2020 4C47
5043 2020 2020 5749 4E44 4F57 5320 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 5EC2 0E30 5A6A 4913 063F A3EC F76C
4B7F CC74 9855 910F 6084 7260 02C3 683B
0C41 E290 D869 76A7 3D8B 1F49 6471 9D54
89A4 75D9 F3F3 1BAA 356B E5D1 A324 83EE
E2BE 5AA8 054A 2BFD 2F5D 38EF 78E4 F8E7
B9F6 4366 A755 049F 0639 1EA7 E720 11FE
2861 0452 AE76 B0D4 546A 60EB CD6A 9A1A
55CB 3FFC 3B3A 7414 1273 72E6 5FA0 811A
26FD FFFF FFFF FFFF FFFF FFFF FFFF FFFF


You can see the 52534131=RSA1 header at the top. There's also a reference in the middle of the file to LGE=LG Electronics, I think. It's 4C4745 and there's a reference to Windows a few bytes after it.

Darren
June 6th, 2009, 06:01
Looks like the

0000 0100 0100 is your rsa e (10001) 65537

and

C466422D5B504C37274448C2D0126187DCCF7109E94D23197A5DCFFAABB3AA7F44C9A45A6529DAD505DDB5CC92FBE02F8EAB412A434CCF98989B444B0021D21E7C31FA52FE8E0610E9BF85926DDD8A65BA26408EE318BAF11A9CD334EE68CAD186B444CC7020CF940756FF08568276C517DF85222689310C9AA4FFBDEE4F7187

is your rsa n (modulus)

1024bit
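The exponent and key size can be read straight out of the dump with a few lines of Python. This is only a sketch: it treats the three fields starting at RSA1 as little-endian (magic, bit length, public exponent). The eight bytes just before RSA1 would also fit the header of a Windows CryptoAPI public key blob (type 6, version 2, algorithm 0x2400), but that reading is an assumption and nothing in the thread confirms it.

```python
import struct

# First 32 bytes of the dump in the post, byte for byte.
blob = bytes.fromhex(
    "00000000 9C000000 06020000 00240000"
    "52534131 00040000 01000100 87714FEE"
)

start = blob.find(b"RSA1")
magic, bitlen, pubexp = struct.unpack_from("<4sII", blob, start)
print(magic, bitlen, pubexp)

# Assumed layout of the 8 bytes in front of the magic: type, version,
# reserved word, algorithm id.
btype, bversion, reserved, alg_id = struct.unpack_from("<BBHI", blob, start - 8)
print(btype, bversion, reserved, hex(alg_id))
```

Read this way, the 0004 0000 in front of the exponent is the key length (0x400 = 1024 bits), and the exponent itself is the four bytes 0100 0100.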

WaxfordSqueers
June 6th, 2009, 07:23
Quote:
[Originally Posted by Darren;80966]Looks like the 0000 0100 0100 is your rsa e (10001) 65537....
Thanks for input Darren. I traced through it all the way tonight and it does use some heavy math decoding in one part but it seems related to the SLP20 issue for Vista. Apparently they are encoding BIOSes in Vista for OEM, or something. The ASCII for SLP20 is right in the area of the code being manipulated.

There is a lot of heap activity and moving large chunks of data with memcpy to a heap area. I wasn't looking for decoding as much tonight as just making signposts so I could retrace my steps later. At one point, there was some coprocessor work with large decimal fractions like 1.009999999......

They may be using the RSA1 encryption as part of that. It just occurred to me that my BIOS is much older than that and that I may need to use an older BIOS utility with an older 512k BIOS. The one I referenced here is from 2008 and my BIOS is circa 2003. That SLP20 part is not in my BIOS and that may be a good part of the reason why the utility is not able to load it.

I trudged through an SHA1 encryption once, letting the program do the unencrypting. It was a matter of finding the accumulation points where they gather and manipulate their keys. By setting a BP on those points I could run the entire sections of heavy code without having to get bleary-eyed over the decoding. I was hoping to do the same here. However, the SHA1 was used as a CRC check, not as an encryption of the code itself.

Darren
June 6th, 2009, 11:47
With it using a hash and rsa maybe some signature is being verified ??

I assumed you noticed the

8771 4FEE
BDFF A49A 0C31 8926 2285 DF17 C576 8256
08FF 5607 94CF 2070 CC44 B486 D1CA 68EE
34D3 9C1A F1BA 18E3 8E40 26BA 658A DD6D
9285 BFE9 1006 8EFE 52FA 317C 1ED2 2100
4B44 9B98 98CF 4C43 2A41 AB8E 2FE0 FB92
CCB5 DD05 D5DA 2965 5AA4 C944 7FAA B3AB
FACF 5D7A 1923 4DE9 0971 CFDC 8761 12D0
C248 4427 374C 505B 2D42 66C4

is the modulus in reverse order
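The reversal is easy to check in Python: reading the 128 dump bytes as a little-endian integer gives the same number as the big-endian hex string quoted earlier. The variable names here are just for illustration.

```python
# The 128 modulus bytes exactly as they appear in the dump (file order).
dump = """
8771 4FEE
BDFF A49A 0C31 8926 2285 DF17 C576 8256
08FF 5607 94CF 2070 CC44 B486 D1CA 68EE
34D3 9C1A F1BA 18E3 8E40 26BA 658A DD6D
9285 BFE9 1006 8EFE 52FA 317C 1ED2 2100
4B44 9B98 98CF 4C43 2A41 AB8E 2FE0 FB92
CCB5 DD05 D5DA 2965 5AA4 C944 7FAA B3AB
FACF 5D7A 1923 4DE9 0971 CFDC 8761 12D0
C248 4427 374C 505B 2D42 66C4
"""

le_bytes = bytes.fromhex("".join(dump.split()))
n = int.from_bytes(le_bytes, "little")   # file order is least significant first

print(len(le_bytes), n.bit_length())
print(format(n, "X")[:16])
```

The byte count is 128 and the top bit of the last dump byte (C4) is set, so the value is a full 1024-bit modulus.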

FrankRizzo
June 6th, 2009, 12:23
I remember reading about a crypto-identifier tool somewhere. Have you run it on this app? It should be able to tell you if there is in fact RSA kind of code in there.

WaxfordSqueers
June 6th, 2009, 18:02
Quote:
[Originally Posted by FrankRizzo;80970]I remember reading about a crypto-identifier tool somewhere. Have you run it on this app? It should be able to tell you if there is in fact RSA kind of code in there.
A quick scour of the net hasn't revealed anything yet but here are some interesting articles on RSA from that world-famous reversing site:

http://www.woodmann.com/forum/showthread.php?t=11143&highlight=rsa

http://www.woodmann.com/forum/showthread.php?t=11864&highlight=rsa

http://www.woodmann.com/forum/showthread.php?t=11823&highlight=rsa

Thanks for the tip. I'll keep looking but I'm not sure what I'm looking for yet. From reading through the link articles I provided above, encryption is used for different reasons, not always to thwart a reverser. I want to decode the ROM file to get an idea what each segment does and specifically to identify the area used by manufacturers for devices like SATA controller. I have a newer firmware set that I'm hoping can be incorporated into the main BIOS. Then again, it might be aimed at the controller chip on the mother board.

Hard drives use both. Much of the firmware for a hard drive resides on the disk itself, in areas inaccessible to the OS. However, much of the firmware is duplicated in the drive's ROM area in its microprocessor chip. If the drive can't find the info it needs on the disk, because it's damaged, it can sometimes find it in its onboard ROM.

Thinking from a hardware perspective, some Flash memory modules are designed to receive 512k chunks of data. I would imagine some of the compression is aimed at getting more into one of those segments and not to thwart a reverser. It's quite a mental juggling act trying to think of what a ROM file is about but I'm going on the premise that the compressed/encrypted ROM is decompressed/unencrypted at some point in memory.

It would be really handy if an app existed, like the one you describe, but I'll poke around some more and see if the AMI utility will do the dirty work. The reason I posted in this section was to see if anyone could ID any compression/encryption from the algorithm I described. It turns out the process is far more complicated than the routine I mentioned.

disavowed
June 16th, 2009, 21:27
Quote:
[Originally Posted by FrankRizzo;80970]I remember reading about a crypto-identifier tool somewhere. Have you run it on this app? It should be able to tell you if there is in fact RSA kind of code in there.


the kanal plugin for peid is pretty good

WaxfordSqueers
June 18th, 2009, 19:14
Quote:
[Originally Posted by disavowed;81154]the kanal plugin for peid is pretty good
Thanks for tip, disavowed. I should have updated this post but I'm still working on the problem.

The algorithm is a linked list. It begins with a 16-bit segment/offset, which points to a string before the last module in a module segment. It gets another segment/offset in that string which points to a previous string, located before the previous module. It works its way to the front of the file till it reaches a dword of FF's.
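A sketch of that backward walk in Python, tested against a synthetic image only. It assumes the link sits in the first four bytes of each string as a little-endian offset word followed by a segment word, and that the file offset is the linear address minus the file size; the thread does not spell out those details, and the helper names are invented.

```python
import struct

END_MARK = 0xFFFFFFFF


def walk_headers(data: bytes, first_off: int, file_size: int):
    """Yield header offsets, following seg:off links until a dword of FFs."""
    seen = set()
    off = first_off
    while 0 <= off <= len(data) - 4 and off not in seen:
        seen.add(off)                      # guard against looping links
        yield off
        (link,) = struct.unpack_from("<I", data, off)
        if link == END_MARK:
            break                          # dword of FF's: front of the list
        offset, segment = struct.unpack_from("<HH", data, off)
        off = (segment << 4) + offset - file_size


def make_link(target_off: int, file_size: int) -> bytes:
    """Build a hypothetical link that points at file offset `target_off`."""
    linear = target_off + file_size
    return struct.pack("<HH", linear & 0xF, linear >> 4)


# Synthetic 512 KiB image: header at 0x7000 -> 0x3000 -> 0x1000 -> end.
SIZE = 0x80000
image = bytearray(SIZE)
image[0x7000:0x7004] = make_link(0x3000, SIZE)
image[0x3000:0x3004] = make_link(0x1000, SIZE)
image[0x1000:0x1004] = b"\xff" * 4
chain = list(walk_headers(bytes(image), 0x7000, SIZE))
print([hex(o) for o in chain])
```

The `seen` set and the bounds check are additions for safety on damaged files; the disassembly shown earlier only has an upper-bound compare and the FFFFFFFF test.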

Later, they use a variation of that. The utility which loads and deciphers the ROM files has a data area made up of keys followed by pointers. The last ROM segment has various keys in it and the utility takes a byte-length key from the segment and searches through its data area for the key. When it finds the key, it jumps to the pointer after the key, which is code. The different code areas called perform specific functions in the ROM file.
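The key/pointer area behaves like a small dispatch table. A Python sketch of the idea, with made-up keys and handlers since the real values are not given in the thread:

```python
def find_handler(table, key):
    """Linear search of (key, handler) pairs, like the utility's data area."""
    for entry_key, handler in table:
        if entry_key == key:
            return handler
    return None


# Hypothetical handlers standing in for the code the pointers lead to.
def handle_a(module: bytes) -> str:
    return "A handled %d bytes" % len(module)


def handle_b(module: bytes) -> str:
    return "B handled %d bytes" % len(module)


# Hypothetical one-byte keys; the real table contents are unknown.
TABLE = [(0x01, handle_a), (0x02, handle_b)]

handler = find_handler(TABLE, 0x02)
result = handler(b"\x00" * 16) if handler else None
print(result)
```

A key that is not in the table returns None here; what the utility does in that case is not described.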

The modules in the ROM file are mainly compressed but some are not. There are CRC checks as well which are not too complex. There is another tool which will extract the module you select and it will either decompress it or leave it compressed. My understanding is that Award and AMI use LHA compression. I still haven't encountered the RSA encryption talked about earlier in the post and I'm keeping my fingers crossed that I don't need to deal with it.