Access violations and other dirty deeds.


WaxfordSqueers
November 17th, 2006, 20:13
I'm back in the noob section because I still have a lot of noob questions to ask. I was just in Tools of Our Trade, and although Kayaker is always kind and generous with his responses, by the time I finish trying to unravel what he said, I feel better coming back here. He's so far ahead of me it's scary.

Anyway, I'm working on a failed installation involving an Inno installer. I guess it's the poor man's InstallShield. It has one setup file and a huge binary compressed with the same compressor used by 7zip. There are a dozen files to be decompressed, but while it's decompressing a 300 meg file, it stops with an error message about an access violation at address xxxxxxxx with Read yyyyyyyy.

I can think of several types of access violations, but they usually involve memory that isn't there, or a bad pointer. This error happens at the same code point in the setup file, which is actually an image of setup.exe mapped into memory from the temp folder in 'documents and settings'. I've tried to run this setup in XP SP2, win 98 and with a utility that can decompress this kind of compression. The outcome and fail addresses are similar.

Could this be just messed up data in the file being decompressed, or is there something else that will produce this kind of error? I would think a disk read error would produce a different error, so it seems something in the bad file is making the software point to a bad memory location. I'm a bit stumped and would appreciate input.

The code location is 4585e9 and the read location is 7a4052. The code at 4585e9 is MOV CL, [EAX+EDX], where edx is a constant and eax keeps changing. I'm assuming that when eax+edx = 7a4052, the system is unable to read that location. I haven't figured out yet how the eax value is derived. I'm just wondering if I'm up the wrong trail.

I tried BPX 4585E9 IF (eax+edx)==0x7a4052, making sure I was in the right address context. I tried the same with a BPM using RW, but neither will break. Can anyone see anything wrong with the expressions?

Kayaker
November 17th, 2006, 23:39
Aw shit Waxford, you shouldn't feel that way. At the risk of chasing you elsewhere, have you tried setting the breakpoint to *before* the error occurs, i.e. at
BPX 4585E9 IF (eax+edx)==0x7a4051
assuming eax keeps increasing by 1

or similarly at the address immediately *before* the faulting instruction 4585E9
BPX 4585Ex IF (eax+edx)==0x7a4052

That way you trace *up* to the error rather than trying to break *at* the error. I don't know if it would work but you might also try FAULTS ON and just let it run, Softice may do the breaking for you.

K.

LLXX
November 17th, 2006, 23:41
Quote:
[Originally Posted by WaxfordSqueers;62486]7zip ... access violation ... 300 meg file
Most likely a hardware problem, specifically the RAM.

Approximately a year ago I was puzzling over why decompressing certain large 7zip archives resulted in errors such as what you've mentioned. It turned out my RAM, which I'd overclocked to 434MHz from 400, seemed to work fine for everything else but failed at this.

Your RAM could just be failing if it's running at default speed. Try running MemTest86+ on your machine for a few hours.

WaxfordSqueers
November 18th, 2006, 00:36
Quote:
[Originally Posted by Kayaker;62489]Aw shit Waxford, you shouldn't feel that way.
I was only kidding, Kayaker, I enjoy your exposés. I just wish I had your experience, especially with dissecting softice. You didn't chase me out, I was waiting to see how things might develop with your examination of the hwnd problem. Anyway, I couldn't really see another forum section that combined my lack of experience and the topic.

Quote:
[Originally Posted by Kayaker;62489] At the risk of chasing you elsewhere, have you tried setting the breakpoint to *before* the error occurs, i.e. at
BPX 4585E9 IF (eax+edx)==0x7a4051
Yeah...I've tried playing with that but here's the problem. The file is huge (300 meg) and this address is in the middle of an iteration loop that has over 2 million iterations. EDI counts from 1 to 0x200000, and the compare for that is a few statements past 4585E9.

What I have done is BPX'ed one code step after the jump at the end of the iteration loop, then F5'd till the error message broke. From the beginning of the file processing, it takes 14 x F5's (BPX is one after the end of the 0x200000 iteration loop) till the error message comes up. So, I do 13 x F5 and that lands me within 0x200000 iterations of the fault.

The eax reg is not indexing by 1. The value varies, and where the value eax+edx==7A4052 hits, I don't know. I have tried BPXing right on that value, which should be DS:7a4052. You can watch it in the softice window. When the statement MOV cl, [eax+edx] at 4585e9 is the EIP, the value for eax+edx is shown at the top right of the screen as DS:xxxxxxxx. I'm assuming that's the READ to which the error refers.

There's a large amount of code in the iteration loop with numerous calls. What I'll have to do is sit down and see if I can decipher some of it and maybe find out where the eax is getting its value. I'm sure it has something to do with the decompressing routine, and I think the source code for 7zip is available. This installation uses the 7zip compression algorithm, but it implements it using its own code.

I may be chasing shadows anyway. I'm hoping one byte is out of whack and maybe I can fiddle it to get by the access violation. Mind you, I've seen some pretty weird causes of those kinds of errors, one of them being a registry access that didn't have the required permissions. This app is not accessing the registry, however.

I'll try your Faults On idea. Thanks for responding.

WaxfordSqueers
November 18th, 2006, 01:00
Quote:
[Originally Posted by LLXX;62490]Most likely a hardware problem, specifically the RAM.......Your RAM could just be failing if it's running at default speed. Try running MemTest86+ on your machine for a few hours.


thanks for the tip LLXX. I've considered that, and it's a possibility. When I bought this system, I couldn't load XP. I could load Win98, and it would run, but not XP. It always froze at MUP.SYS as it was loading. I checked RAM using the app you suggest and even swapped RAM...no luck. I took it back and asked the vendor to check the processor. He was incredulous, thinking I was certifiable. The minute he changed the processor (P4 Celeron 2000), away it went. So, anything is possible. Of course, then he looked at me like I had screwed it when I installed it.

The reason I don't think it's RAM is that I've tried the install 3 different ways. Each time, the app stops at the same statement, MOV CL, [EAX+EDX]. It does the same in XP, Win98 and in a DOS-based decompression utility made especially for Inno installers. However, the READ portion of the error is different in each instance, suggesting to me that a different portion of RAM is used in each case. But, I will check it.

I never overclock. I'm an electronics/computer tech, among other things, and it gives me the willies to run hardware faster than it's meant to go. What I'm thinking is average power. The chips are designed to handle a certain amount of power, and when you increase the clock speed, you are upping the power consumption. If the chips can dissipate the power, fine. But if they are on the edge...trouble!! And as you found out, you can get timing problems.

The tiny transistor junctions in the chips have a finite capacity to charge and discharge. They act like tiny capacitors, in a way, and there is a delay to the rising and decaying edge of a square wave. If you seriously overclock them, the ability of the junctions to respond is impaired, and that can affect the timing. In order to increase the overall speed in motherboards and chips, engineers have to continually tweak parameters and allow for the in-built limitations of transistors at high speed. The design of motherboards alone, with the ground planes and decoupling capacitors, is an art in itself.

LLXX
November 18th, 2006, 03:06
Quote:
[Originally Posted by WaxfordSqueers;62493]MOV CL, [EAX+EDX]
This looks like part of the LZ repetition loop (i.e. copies repeated sequences of bytes). You should see a "MOV [...], CL" somewhere below it if my hypothesis is correct.

It is also known to be the most stressing routine of the decompression code, as it makes rapid interlaced reads and writes to the RAM.

If running it on another machine also results in the same error, then the compressed data is likely corrupted (it tried to read beyond the end of the LZ buffer). Take some time and read through some docs on how LZ compression works.

WaxfordSqueers
November 18th, 2006, 03:27
Quote:
[Originally Posted by LLXX;62495]This looks like part of the LZ repetition loop (i.e. copies repeated sequences of bytes). You should see a "MOV [...], CL" somewhere below it if my hypothesis is correct. It is also known to be the most stressing routine of the decompression code, as it makes rapid interlaced reads and writes to the RAM.


Thanks for the input LLXX, I think you're right. I've shut it down for the night, but I'll verify tomorrow. I see the same instructions in other places and it uses DL as well as CL. Also, as I watch the value of [eax+edx], they jump back and forth between Axxxxx and Bxxxxx, sometimes dipping down to 9xxxxx. I was wondering what it was doing.

Yes, I'll have to do some reading. If the data is corrupt, I'm sure it will be in more than one byte. All the other large files, one is over a gig, are fine. I managed to decompress them all except the one file.

SiGiNT
November 18th, 2006, 12:49
Am I missing something here, or is my assumption incorrect? It just seems that moving a DWORD into one WORD of the register is fundamentally wrong. I would re-download the prog and retry.

SiGiNT

WaxfordSqueers
November 18th, 2006, 14:22
Quote:
[Originally Posted by sigint33;62501]Am I missing something here? or is my assumption incorrect? it just seems that moving a DWORD into 1 WORD of the register is fundamentally wrong? I would re-down the prog and retry.SiGiNT


I assume you're referring to the MOV CL, [EAX+EDX]? I'm no expert, but to me, the square brackets around EAX+EDX tell me it's the value found at the address of EAX+EDX that's to be moved to CL. Here's how it looks on the softice screen and in IDA:

4397FE mov cl, [edx+eax]

The address in the error message changes depending on what app I'm using. The above refers to the decompressor innounp.exe, which can be freely downloaded. The error occurs at exactly the same statement and in the same code in the app's setup.exe decompressor.

In softice, if I am currently at that EIP, the value for [EDX+EAX] is highlighted at the top right of the screen and is prefixed with a DS:, telling me it's in the data segment. It might show ds:000A3348 = FD, and to me, that's the equivalent of:

[EDX+EAX] = [DS:000A3348] = FD

So, the app is always moving a byte to CL, which is correct. A word in ECX would be CX, and CX = CL + CH. ECX is a doubleword, CX is a word and CL and CH are bytes. Since the function of the app is to decompress data, I would imagine it would be operating on bytes only, and the value moved to CL is definitely a byte.
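The register sizes described above can be sketched in a few lines of Python. This is only an illustration of how CL, CH and CX overlay ECX; the function names are made up for the example and nothing here comes from the installer itself.

```python
def subregisters(ecx: int) -> dict:
    """Split a 32-bit ECX value into the sub-registers that alias it."""
    ecx &= 0xFFFFFFFF
    return {
        "ECX": ecx,                # full doubleword
        "CX": ecx & 0xFFFF,        # low word
        "CH": (ecx >> 8) & 0xFF,   # high byte of CX
        "CL": ecx & 0xFF,          # low byte of CX
    }


def mov_cl(ecx: int, byte_value: int) -> int:
    """Model MOV CL, <byte>: only the lowest 8 bits of ECX change."""
    return (ecx & 0xFFFFFF00) | (byte_value & 0xFF)


regs = subregisters(0x12345678)
after = mov_cl(0x12345678, 0xFD)
```

Loading the byte FD into CL leaves the upper three bytes of ECX untouched, which is why a byte-sized destination implies a byte-sized read from memory.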

Also, this same code successfully decompresses other large files, one of which is over a gigabyte in size.

Your point is a good one in another sense, however. When I use the softice conditional expression:

BPX 4397FE IF (EDX+EAX)==0x92FF16, you will notice they (softice people) use curved brackets. I find that confusing because in one of their examples, they use the expressions:

BPM es:di+1f w if (*(es:di+1f)==0x10) and
BPM ds:80150000 if (byte(*ds:80150000)>5)

In both of these statements, I'm assuming the little * refers to a pointer. So, the expression (byte(*ds:80150000) should refer to the byte value at address ds:80150000. Also, the expression *(es:di+1f) would refer to the value at address es:di+1f. Maybe someone could straighten me out if I'm wrong. I'm wondering why their decompilation uses square brackets, yet they require you to use curved brackets to enter a condition. I guess it's because they don't want to use the * pointer operator, or they are sticking to Intel convention.

The fact that the second expression wants to break if a value is greater than 5, suggests they are looking for a byte value at that address. In my case, however, I'm not looking for the value at the READ address, I'm looking for the address it's reading, which should be the sum EDX+EAX.
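The difference between testing an address and testing the value stored at that address can be sketched as follows. The memory dictionary and both function names are hypothetical, and the SoftICE forms in the comments follow the style of the manual examples quoted above rather than anything verified against the debugger.

```python
# Hypothetical one-byte memory image: the DS:000A3348 = FD example from the post.
memory = {0x000A3348: 0xFD}


def address_condition(eax: int, edx: int, target_address: int) -> bool:
    """Like (eax+edx)==target: compares the computed address itself."""
    return ((eax + edx) & 0xFFFFFFFF) == target_address


def value_condition(eax: int, edx: int, expected_byte: int) -> bool:
    """Like byte(*(eax+edx))==expected: compares the byte stored at that address."""
    address = (eax + edx) & 0xFFFFFFFF
    return memory.get(address) == expected_byte


hits_address = address_condition(0x3348, 0xA0000, 0x000A3348)
hits_value = value_condition(0x3348, 0xA0000, 0xFD)
```

With round brackets only, the condition is about where the instruction is going to read; adding the * turns it into a condition about what is stored there.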

blabberer
November 18th, 2006, 14:41
the power of ida

ida always shows it like that i tried to make it display
mov cl,byte ptr ds:[eax+edi]
but there doesnt seem to be an option like that in ida at least not in ida free some times it makes confusion if it isnt explicit

.text:00401000 000 8A 0C 38 mov cl, [eax+edi]

00401000 >/$ 8A0C38 MOV CL, BYTE PTR DS:[EAX+EDI]

ollydbg on the other hand shows it nicely with size

SiGiNT
November 18th, 2006, 17:24
Ahhh!

So I haven't gone daft - I've never used Sice and I'm unfamiliar with its conventions, thanx, I was beginning to question what little I know about assembler.

SiGiNT

The retail versions of Ida don't match olly exactly but it's always clear that it's a pointer.

WaxfordSqueers
November 18th, 2006, 19:07
Quote:
[Originally Posted by sigint33;62505]The retail versions of Ida don't match olly exactly but it's always clear that it's a pointer.


I don't know where I got it from, but I've always thought of anything in square brackets as referencing the address. For example, if I see:

MOV edx, eax

I think of the value already in the EAX register as moving to EDX. If you follow that in softice, and EAX=0401000, then after you execute the MOV instruction, EDX will have 0401000 in it.

If I see:

MOV edx, [eax]

I think of the value at the memory address pointed to by the value in EAX as moving to EDX. So, I'm taking [EAX] as a pointer. I don't know if that's correct, but that's what I do. I very seldom look at the jargon before the brackets, like byte, ptr, etc. Besides, in softice, when the EIP code is highlighted, and an [EAX] is used, you can look at the top right of the screen and see the actual value of [EAX] and what it contains.

In that case, if EAX=0401000, as before, after execution, I would expect to see the value in address 0401000 in the EDX, whatever it was. I go by the size of the register in the destination to know what the size is. In this case, I know a doubleword is being moved because the destination is EDX and the vanilla MOV instruction is used.

If it was a byte being moved, or a word, the MOVZX instruction is often used. This moves a byte, or word (I think), and pads it out with zeros to fill the doubleword EDX register. If 0401000 had the byte FF in it, and the MOVZX instruction was used, after execution, EDX would be 000000FF.
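A small Python model of the two loads described above, assuming little-endian memory as on x86. The helper names and the four sample bytes are invented for the illustration.

```python
def mov_dword(memory: bytes, address: int) -> int:
    """Model MOV EDX, [EAX]: read four bytes, little-endian."""
    return int.from_bytes(memory[address:address + 4], "little")


def movzx_byte(memory: bytes, address: int) -> int:
    """Model MOVZX EDX, BYTE PTR [EAX]: one byte, upper 24 bits zeroed."""
    return memory[address]


def movzx_word(memory: bytes, address: int) -> int:
    """Model MOVZX EDX, WORD PTR [EAX]: two bytes, upper 16 bits zeroed."""
    return int.from_bytes(memory[address:address + 2], "little")


# Hypothetical contents of the four bytes starting at the address held in EAX.
sample = bytes([0xFF, 0x11, 0x22, 0x33])
edx_after_mov = mov_dword(sample, 0)
edx_after_movzx = movzx_byte(sample, 0)
```

The same first byte FF ends up as the low byte of EDX in both cases; what differs is whether the remaining bits come from the neighbouring memory or are forced to zero.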

WaxfordSqueers
November 26th, 2006, 21:42
Quote:
[Originally Posted by LLXX;62490]Your RAM could just be failing if it's running at default speed. Try running MemTest86+ on your machine for a few hours.
LLXX...I ran a memory diagnostic from www.simmtester.com. I have memtest86 somewhere, but can't lay my hands on it at the moment. Anyway, this one is pretty good. It does the following tests, from DOS:
March B
March C-
Walk Data 0
Walk Data 1
Walk Address 0
Walk Address 1
Mats+

I have not followed your advice as yet to run it for 24 hours, but it indicated no errors in the testing it did.

As I have pointed out, this same memory was used to decompress other files in the same bin file, one which was over 1 gig in size.

LLXX
November 26th, 2006, 22:22
Well, LZ compression can do strange things if the memory is flaky... might work on one file and not on another, since it depends on the pattern of access and contents that are written to the RAM. There is also something called "resonance" that can occur on the bus if a certain sequence of accesses is made; basically the bus becomes a giant oscillator and at the right frequency, the amplitude of the resulting oscillations can flip bits and such.

Like I said earlier, have you tried it on another machine?

WaxfordSqueers
November 26th, 2006, 23:02
Quote:
[Originally Posted by LLXX;62495]This looks like part of the LZ repetition loop (i.e. copies repeated sequences of bytes). You should see a "MOV [...], CL" somewhere below it if my hypothesis is correct.....Take some time and read through some docs on how LZ compression works.
I confirmed your supposition that a MOV [...], CL lay below. Here's the code:

4397EB add [ebp+var_3C], 2

4397EF mov eax, edi
4397F1 sub eax, esi
4397F3 cmp eax, [ebp+var_4C] // =[0x200000]
4397F6 jb short loc_4397FB
4397F8 add eax, [ebp+var_4C] // =[0x200000], eax<0

4397FB mov edx, [ebp+var_48] <---edx = 93DED0 (constant)
4397FE mov cl, [edx+eax] <---access violation here

***note: [eax+edx] has normal range: 9D0000<[eax+edx]<Bxxxxx.
***When error occurs, [eax+edx] = 92FF16. This seems to indicate
***an eax<0, since EDX is always a positive constant.
***the value 0x9D0000 seems to be lower address of dictionary.

439801 mov [ebp+var_19], cl
439804 mov eax, [ebp+var_48]
439807 mov dl, [ebp+var_19]
43980A mov [eax+edi], dl
43980D inc edi
43980E cmp edi, [ebp+var_4C]
439811 jnz short loc_439815 <--end of iteration loop
439813 xor edi, edi <---edi=0 for next loop

***I bpx on this address (439813) for one complete iteration loop.

439815 mov eax, [ebp+var_64]
439818 mov dl, [ebp+var_19]
43981B mov [eax], dl
43981D inc [ebp+var_2C]
439820 inc [ebp+var_64]
439823 dec [ebp+var_3C]
439826 cmp [ebp+var_3C], 0
43982A jle short loc_439834
43982C mov ecx, [ebp+var_2C]
43982F cmp ecx, [ebp+var_C]
439832 jb short loc_4397EF
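The listing above reads like the match-copy step of an LZ decoder working on a circular dictionary. A rough Python sketch of that idea follows, on the assumption that EDI is the write position, ESI is the match distance and var_4C is the dictionary size; those roles are guesses from the disassembly, and the function names are invented.

```python
MASK32 = 0xFFFFFFFF


def source_index(write_pos: int, distance: int, dict_size: int) -> int:
    """Mirror of: mov eax,edi / sub eax,esi / cmp eax,size / jb ok / add eax,size."""
    index = (write_pos - distance) & MASK32      # wraps like a 32-bit register
    if index >= dict_size:                       # unsigned compare, as JB does
        index = (index + dict_size) & MASK32     # wrap back into the buffer
    return index


def copy_match(dictionary: bytearray, write_pos: int, distance: int, length: int):
    """Copy `length` bytes from `distance` back in the circular dictionary."""
    size = len(dictionary)
    output = bytearray()
    for _ in range(length):
        index = source_index(write_pos, distance, size)
        if index >= size:
            # The real code has no such check and simply reads outside the buffer.
            raise IndexError(f"source index {index:#x} is outside the dictionary")
        value = dictionary[index]
        dictionary[write_pos] = value
        output.append(value)
        write_pos += 1
        if write_pos == size:                    # cmp edi, size / xor edi, edi
            write_pos = 0
    return bytes(output), write_pos
```

A distance no larger than the dictionary always lands inside the buffer after the single add. A distance larger than the dictionary, which a damaged stream could produce, leaves the index far out of range, and that would show up as a read below the buffer base.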

I've tried several things to find the point at which the access violation occurs, but none of them work. I tried the following conditional statement:

bpx cs:4397FE if (eax+edx)==0x92FF16

The bpx conditional statement slows the computer to a crawl. In the iteration loop, where edi counts from 0 to 0x200000, it takes 5 minutes to process 0x10000 iterations. I tried the following to narrow it down:

BPX 439813

That address is one code step after the end of the 0 to 0x200000 loop counted by edi. The XOR edi, edi is obviously zeroing edi for the next iteration loop. By bpxing on that address, and hitting F5, I jump over the 0x200000 iteration loop. If I count the F5's, it takes 14 of them to get the error message, so I do 13 x F5. Then I apply the:

bpx cs:4397FE if (eax+edx)==0x92FF16

It won't break before the error message.

Here's something else funny, depending on what mood you're in:
If I F5 x 13, and set this command:

BPX 4397FE IF edi>0x10000

where 4397FE = mov cl, [edx+eax], it will break every 0x10000 as edi exceeds the value up to the 0x200000 it counts to. BUT....the error message never appears!!! It exceeds the loop count of 0 to 0x200000 several times but the message doesn't appear AND another BPX set after the end of the 0x200000 loop doesn't fire either.

How can that be? With the latter BPX set just after the 0 to 0x200000 loop, the app fires every time. I can count it up to 13 loops by using F5. EVERY time, the message appears on the 14th F5. Yet, when I have the other BPX set, to check for [eax+edx]=0x92ff16, it will exceed the loop several times and never break.

I remember reading that one of the breakpoint statements can be slow, so I tried the BPM conditional statement with:

BPM CS:4397FB RW if eax<0

I was checking on the statement before the access violation to see if eax became negative. When it didn't fire, I did a BL to see if my BPM had been erased. Here's what I found:

BPMB 1:387FB RW DR0 if EAX<0

softice changed my breakpoint on me!!! I thought it might be a confusion with the CS:, so I entered the actual selector value of 1B, and tried again. No luck. It changed it back to the above. What gives??

I also tried a BPMD, but it didn't like that because the address was not on a DWord boundary.

I have considered that maybe I'm using softice incorrectly. I'm using an app that is a shell for the actual decompressor app, which it loads with CreateProcessA. I have loaded the Windows shell program both through loader32 and directly. When I select the file to be decompressed in its filebox window, and start the app decompressing, I immediately break into softice with a ctrl-d.

I arrive in the decompressor's code and have to page down to the code statement I want where the access violation occurs. I highlight the statement with a double-click and hit F5 to bring the EIP up to that point. I have verified that I am in the context of the decompressor because its name is at the bottom of the code window. Also, I have loaded the decompressor in IDA and confirmed the code. I always set the breakpoints in that context.

Kayaker
November 27th, 2006, 13:55
Hi Waxford, you asked me to comment on this..
I can't really follow everything you're trying to do here, but I get that you've found a way to break "close" to the right step in the large iteration loop. Then you try to set a bp to hit the error exactly:

4397FE mov cl, [edx+eax] <---access violation here
***When error occurs, [eax+edx] = 92FF16.
bpx cs:4397FE if (eax+edx)==0x92FF16
It won't break before the error message.

Instead of putting the BP on the faulting instruction itself, what happens if you set it earlier and in a different sense. i.e.
bpx 4397FB if eax == FFFxxxxx
where FFFxxxxx is the negative EAX value that offsets 9D0000 and is causing all the problems.
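The EAX value to watch for can be worked out from the two numbers in the listing. This is plain 32-bit arithmetic and assumes EDX really is 93DED0 at the fault, as the annotated listing states; a later post quotes a different base, so substitute whatever the debugger actually shows. The function names are invented.

```python
MASK32 = 0xFFFFFFFF


def eax_for_fault(fault_address: int, base: int) -> int:
    """EAX value for which base + EAX wraps to fault_address in 32 bits."""
    return (fault_address - base) & MASK32


def as_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a signed number."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


# Assumption: EDX = 0x93DED0 and the faulting read address is 0x92FF16.
eax = eax_for_fault(0x92FF16, 0x93DED0)
signed_eax = as_signed32(eax)
```

Under that assumption EAX comes out as 0xFFFF2046, a small negative offset that places the read just below the buffer base.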

Also, I think it's correct so pardon my mentioning it, but I assume the 'condition' you're trying to hit is really
if (eax+edx)==0x92FF16
and not?
if dword ptr[eax+edx] == 0x92FF16

Re the BPM suddenly looking weird when you try BL. That always happens when you've gone out of context of the original BPM. The address *may* show properly once you're back *in* the proper context, but I've almost NEVER known a BPM to "stick" more than once after you've gone out of context. You almost always have to delete and rewrite the BPM each time.


At this point I'd probably want to know what was the pattern of registers and variables leading up to the error. The first few instructions show that EDI, ESI and [ebp+var_4C] are critical and determine the following EAX value. T'was me I might think about writing an Ollyscript (or Olly trace) which would spit out those values over a long number of iterations before the problem actually occurs.

This way you might see where the problem first creeps in, or perhaps give you another conditional breakpoint in another location that might work better. For example, which register inherently causes the problem here, EDI or ESI?
4397EF mov eax, edi
4397F1 sub eax, esi

Kayaker

LLXX
November 27th, 2006, 19:34
Quote:
[Originally Posted by WaxfordSqueers;62642]I tried the following conditional statement:

bpx cs:4397FE if (eax+edx)==0x92FF16

The bpx conditional statement slows the computer to a crawl. In the iteration loop, where edi counts from 0 to 0x200000, it takes 5 minutes to process 0x10000 iterations.
You may wait 2 hours and 40 minutes if you want, but if my suspicions are correct, it'll just end up going the whole way with no exception.
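The 2 hours 40 minutes figure follows directly from the numbers in the quoted post. A quick check, with a made-up function name:

```python
def full_pass_minutes(total_iterations: int, chunk: int, minutes_per_chunk: float) -> float:
    """Time for one full pass if each `chunk` iterations takes `minutes_per_chunk`."""
    return total_iterations / chunk * minutes_per_chunk


minutes = full_pass_minutes(0x200000, 0x10000, 5)
hours, remainder = divmod(int(minutes), 60)
```

0x200000 / 0x10000 is 32 chunks, and 32 chunks at 5 minutes each is 160 minutes for a single pass of the loop.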
Quote:
Then I apply the:

bpx cs:4397FE if (eax+edx)==0x92FF16

It won't break before the error message.

Here's something else funny, depending on what mood you're in:
If I F5 x 13, and set this command:

BPX 4397FE IF edi>0x10000

where 4397FE = mov cl, [edx+eax], it will break every 0x10000 as edi exceeds the value up to the 0x200000 it counts to. BUT....the error message never appears!!! It exceeds the loop count of 0 to 0x200000 several times but the message doesn't appear AND another BPX set after the end of the 0x200000 loop doesn't fire either.

How can that be? With the latter BPX set just after the 0 to 0x200000 loop, the app fires every time. I can count it up to 13 loops by using F5. EVERY time, the message appears on the 14th F5. Yet, when I have the other BPX set, to check for [eax+edx]=0x92ff16, it will exceed the loop several times and never break.
This further reinforces my suspicions. With a conditional BPX set, SoftICE has to evaluate your expression after every instruction, which greatly changes the access pattern made to the RAM.

With a standard non-conditional BPX set, the checking is done in the hardware and the memory access pattern is not interrupted. In other words, the code executes as it would without the presence of a debugger. That's why the access violation occurs during the 14th execution of the loop at full speed. However, when you set the conditional breakpoint, it slows down the execution and changes the access pattern greatly, to the point where it does not cause the error to occur.

As it stands, I'm still quite convinced it's your hardware at fault. Even more certain would be if you could run past that portion and get it to unpack the entire file under the slowed-down execution with a conditional breakpoint in place (I believe that waiting the three hours is justified in this case).

Kayaker
November 27th, 2006, 22:35
Quote:
[Originally Posted by LLXX;62653]With a conditional BPX set, SoftICE has to evaluate your expression after every instruction, which greatly changes the access pattern made to the RAM.


That's very interesting LLXX, also very perceptive. By that analogy then an Ollyscript or Olly conditional log trace might cause the same effect, it wouldn't break where expected or log the expected errant register values.

It was suggested at first though that the one data file might be corrupted, in which case it's probably not a hardware problem. But if it's a hardware problem, the data file is probably OK.

Just throwing out ideas..I'm not sure if Waxford has hyperthreading, but if it was turned off (or on), or if the installer was run under VMware/VPC, could this also change the RAM access pattern enough so the fault might not occur, or occur during a different iteration?

LLXX
November 28th, 2006, 22:14
Intermittent hardware faults are very "slippery" per se...

Even if you could attach a bus probe to the address and data bus traces, it might change the electrical characteristics of the bus enough to cause the error to occur somewhere else, or even disappear completely. Another example where the observation done to a system causes changes in it...

I'd suggest the best solution would be to wait out those three hours and see if a correctly unpacked file was produced.

WaxfordSqueers
November 29th, 2006, 06:01
Quote:
[Originally Posted by LLXX;62688]Intermittent hardware faults are very "slippery" per se...

Even if you could attach a bus probe to the address and data bus traces, it might change the electrical characteristics of the bus enough to cause the error to occur somewhere else, or even disappear completely. Another example where the observation done to a system causes changes in it...

I'd suggest the best solution would be to wait out those three hours and see if a correctly unpacked file was produced.
Thanks for responding LLXX. Am I detecting a bit of quantum theory here? I am taking your observations seriously, but hardware has been my forte for decades. I was a computer tech for several years, working in the field.

I am theorizing that if I had an intermittent memory cell, then it could do what you say. When you draw current from a device, or more importantly, when you inject a tester into a high frequency device, you can change the parameters enough to make a mysterious (intermittent) fault disappear. There's nothing intermittent about this fault, however, it's solid, and at the same address every time.

Furthermore, I have tried the decompression under different conditions. Currently, I am using a Windows shell app, which is a wrapper around an LZMA decompression utility designed specifically to decompress apps created with the Inno installer. It loads the decompression program using CreateProcessA, but I have been able to break into the decompressor using softice. The code I supplied earlier in this post is from the decompressor.

The decompressor is normally a command line app that has to be run from a Windows DOS box. If I run it as such, it breaks with the error in exactly the same place as the shell app, at code address 4397FE and [eax+edx] read address 92FF16. But....if I decompress using the app's own setup.exe file, the error breaks at code address 4585E9 with [eax+edx] = 7A4052.

The app's own setup.exe program probably uses the LZMA decoder in a different manner than the decompression utility I have been using. I can still see your point, in that the difference in read addresses (92FF16 versus 7A4052) could be accounted for by the way the apps load. I am going to swap my two sticks of memory, and hopefully that will eliminate that argument. Also, when I get the time, I will allow the app to run its course, till it either decompresses the file or throws an error.

I am currently single-stepping through the code, watching the decompression begin. That's why I'm using the Windows shell app; it allows me to select only the file with the error, and I don't have to wait each time while it decompresses the other large files.

The decompressor app first loads a 64K (0x10000) chunk from the beginning of the file to be decompressed. It then zeros in on the first 4 bytes, using the SHL command and an OR command to append successive bytes. I mentioned a code area beginning at (9D0000). The app initializes that section with the address of the imported 64K code chunk at 9D0000 and the incremented value at 9D0004. Then, it inserts FFFFFFFF at 9D0008 and the 4 initial bytes of code it took from the beginning of the 64K chunk at 9D000C.
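The FFFFFFFF and the four bytes assembled with SHL and OR resemble the starting state of a range decoder, which is the entropy-coding layer LZMA places underneath its LZ stage. A minimal sketch of that initialization, on the assumption that this is what the code is doing; the function name is made up, and the reference LZMA decoder differs in detail.

```python
def init_range_decoder(first_bytes: bytes):
    """Build the initial (range, code) pair from the first four input bytes."""
    code = 0
    for value in first_bytes[:4]:
        code = ((code << 8) | value) & 0xFFFFFFFF   # SHL 8, then OR in the next byte
    range_ = 0xFFFFFFFF                             # full 32-bit starting range
    return range_, code


# Hypothetical first four bytes of the compressed chunk.
start_range, start_code = init_range_decoder(bytes([0x12, 0x34, 0x56, 0x78]))
```

If that reading is right, the two dwords at 9D0008 and 9D000C would be the decoder's range and code values, and both change with every decoded bit.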

It does some more housekeeping in that area, then it does something odd. It inserts a series of 0x400 words back to back over most of the rest of the area from about 9D0058 on. When it calls one of them it does an SHR 5, converting the 0x400 to 0x20. It uses those values in the decompression. I have read on LZ77 and LZ78 compression techniques, but there is nothing mentioned about this approach. I'll have to look closer; right now it seems they are pulling magic numbers out of a hat. I'm sure they are related to the initial compression, however.

Later still, it loads 7 dwords and plays with those. As I said, I need to spend more time understanding what they are doing.

What puzzles me at the moment is how it uses the 0x400 values to create another value it subtracts from the initial 4 bytes. There is also some XORing, ANDing, ORing and the SHL and SHR commands. It's not at all what I expected from reading LZ77/78 theory. I'd appreciate any commentary you'd like to make.
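The 0x400 words, the SHR 5 and the subtraction from the initial four bytes match the adaptive bit decoder used by LZMA's range coder, which plain LZ77/78 descriptions do not cover. In LZMA each probability is an 11-bit value that starts at 0x400 (one half) and is nudged by 1/32 of the remaining distance after every decoded bit. The sketch below is simplified and assumes the installer follows that scheme; the class is illustrative, reads only four initial bytes as described in the post, and is not the installer's code.

```python
MASK32 = 0xFFFFFFFF
BIT_MODEL_TOTAL_BITS = 11
BIT_MODEL_TOTAL = 1 << BIT_MODEL_TOTAL_BITS   # 0x800
MOVE_BITS = 5                                 # the SHR 5 seen in the trace
TOP_VALUE = 1 << 24
PROB_INIT = BIT_MODEL_TOTAL >> 1              # 0x400, "bit is 0" has probability 1/2


class RangeDecoder:
    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0
        self.range = 0xFFFFFFFF
        self.code = 0
        for _ in range(4):
            self.code = ((self.code << 8) | self._next_byte()) & MASK32

    def _next_byte(self) -> int:
        value = self.data[self.pos] if self.pos < len(self.data) else 0
        self.pos += 1
        return value

    def decode_bit(self, probs: list, index: int) -> int:
        prob = probs[index]
        bound = (self.range >> BIT_MODEL_TOTAL_BITS) * prob
        if self.code < bound:
            self.range = bound
            probs[index] = prob + ((BIT_MODEL_TOTAL - prob) >> MOVE_BITS)
            bit = 0
        else:
            self.range -= bound
            self.code -= bound                # the subtraction from the 4-byte value
            probs[index] = prob - (prob >> MOVE_BITS)
            bit = 1
        if self.range < TOP_VALUE:            # renormalize: pull in one more byte
            self.range = (self.range << 8) & MASK32
            self.code = ((self.code << 8) | self._next_byte()) & MASK32
        return bit
```

The first time a fresh probability decodes a 1, it drops from 0x400 to 0x3E0, because 0x400 >> 5 is the 0x20 noted in the post. The decoded bits are then assembled into literals, match lengths and match distances, which in turn drive the dictionary copy.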

WaxfordSqueers
November 29th, 2006, 07:20
Quote:
[Originally Posted by Kayaker;62649]Hi Waxford, you asked me to comment on this..
I can't really follow everything you're trying to do here, but I get that you've found a way to break "close" to the right step in the large iteration loop. Then you try to set a bp to hit the error exactly:
Thanks Kayaker, and sorry for delay in replying. I know you must be busy and I don't want to distract you helping me specifically. I was interested in your commentary on the softice issue. Thanks for your input on that.

With regard to your suggestion on relocating the BP, here's what I've done:

4397EF mov eax, edi
4397F1 sub eax, esi
4397F3 cmp eax, [ebp+var_4C] // =[0x200000]
4397F6 jb short loc_4397FB
4397F8 add eax, [ebp+var_4C] // =[0x200000], eax<0

4397FB mov edx, [ebp+var_48] <---edx = 93DED0 (constant)
4397FE mov cl, [edx+eax] <---access violation here

I have looked at the value of eax back at 4397F1, where the value in esi is subtracted from it. It is negative after that operation. I have BPXed with a condition on eax at 4397FB such as:

BPX cs:4397FB if eax<0

It did not break. Since [eax+edx] is actually [eax+0x9D3ED0] at that point, the sum has to be greater than 9D0000. I have discovered since last talking to you, that 9D0000 is the area the app initializes with certain values for its decompression routine. The 0x92FF16 value is too low in that case.
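One possible reason a condition like eax<0 never fires: a 32-bit register holds a bit pattern, and if the debugger's expression evaluator compares it as an unsigned number (an assumption, not confirmed here), nothing is ever less than zero. The JB in the listing is itself an unsigned test. A small sketch with invented helper names and a hypothetical EAX value:

```python
MASK32 = 0xFFFFFFFF


def unsigned_less_than_zero(eax: int) -> bool:
    """An unsigned 32-bit value is never below zero, so this is always False."""
    return (eax & MASK32) < 0


def looks_negative(eax: int) -> bool:
    """Signed view: the top bit is set."""
    return (eax & MASK32) >= 0x80000000


def jb_taken(eax: int, limit: int) -> bool:
    """JB after CMP eax, limit: taken when eax is below limit, unsigned."""
    return (eax & MASK32) < (limit & MASK32)


bad_eax = 0xFFFFF000  # hypothetical "negative" offset
```

If that is what is happening, a condition written against the unsigned value, along the lines of eax>0x1FFFFF at 4397FB, would express the same idea, since any valid index at that point should be below the 0x200000 dictionary size.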

There is a fly in the ointment, however. I hadn't paid too much attention to the

4397F6 jb short loc_4397FB

statement. I have noticed there is another little loop using it, and it jumps over the address:

4397F8 add eax, [ebp+var_4C] (where [ebp+var_4C]==0x200000)

Normally, that jump is not made. I really have to do a lot more digging to understand what is going on.

It seems to me there can only be two possibilities: either the eax value is out of whack, or the edx value has changed. [eax+edx] is summing to 92FF16, which is below the normal range. I'm just getting into the details of this now, trying to discover where eax and edx originate.
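
The instructions from 4397EF to 4397F8 can be read as a wrap-around index into a circular window. Here is a small Python sketch of the same arithmetic, assuming edi is the current write position, esi is the match distance, and [ebp+var_4C] is the window size of 0x200000. Those roles are guesses from the listing, and the function names are made up. Note that jb is an unsigned compare, so a "negative" eax is just a very large unsigned value.

```python
MASK32 = 0xFFFFFFFF


def window_offset(edi, esi, dict_size):
    """Emulate: mov eax,edi / sub eax,esi / cmp eax,size / jb skip / add eax,size."""
    eax = (edi - esi) & MASK32               # sub wraps modulo 2**32
    if eax >= dict_size:                     # jb NOT taken (unsigned compare)
        eax = (eax + dict_size) & MASK32     # add eax, [ebp+var_4C]
    return eax


def inside_window(offset, dict_size):
    """True when base+offset would still land inside the window buffer."""
    return offset < dict_size


SIZE = 0x200000
print(hex(window_offset(0x100, 0x10, SIZE)))       # plain case, no wrap needed
print(hex(window_offset(0x10, 0x100, SIZE)))       # wraps to the end of the window
print(hex(window_offset(0x10, 0x300000, SIZE)))    # distance too large: stays out of range
```

If the distance is larger than the write position plus the window size, the offset remains outside the window even after the add, and base+offset then lands below the buffer. That is the kind of value that would explain a read at an address lower than the normal range.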

You asked if maybe I was using the wrong expression in my BPX statement. Anything's possible. The app is looking for a byte value at the address pointed to by [eax+edx], to put in the CL register. The curved brackets used by softice are a bit confusing. In the expression:

BPX cs:4397FE if (eax+edx)==0x92FF16

it seems to suggest the sum of eax+edx should equal 0x92FF16. If I used your expression:

if dword ptr[eax+edx] == 0x92FF16

then it seems I am looking for the value at [eax+edx] to be 0x92FF16, which would not be right. I have seen in the softice manual a reference to *(eax+edx). The value at [eax+edx] should be a byte, but it's that byte the app is unable to find at 0x92FF16.

For example, if I use the d (dump) command in softice, like d esp, it dumps the code/data at the address esp. If I use d *esp, it dumps the content pointed to by the pointer at that address, if in fact it is a pointer. In that case, I'm assuming

dword ptr[eax+edx]

would refer to the pointer at address [eax+edx]. You would know a lot more about that than me.
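
The difference between the two conditions can be illustrated outside SoftICE. This is a toy Python model, not SoftICE syntax: memory is a plain dict, and the register values and addresses are invented for the example. Comparing eax+edx tests the address itself, while dereferencing tests whatever is stored at that address.

```python
MASK32 = 0xFFFFFFFF

# Hypothetical memory: one readable byte at 0x1000, nothing mapped elsewhere.
memory = {0x1000: 0x41}


def address_condition(eax, edx, target):
    """Models 'if (eax+edx)==target': compares the computed address."""
    return ((eax + edx) & MASK32) == target


def value_condition(eax, edx, target):
    """Models 'if *(eax+edx)==target': compares what the address holds."""
    addr = (eax + edx) & MASK32
    if addr not in memory:
        return None                  # nothing readable there, cannot compare
    return memory[addr] == target


print(address_condition(0x0F00, 0x0100, 0x1000))   # address matches
print(value_condition(0x0F00, 0x0100, 0x1000))     # stored byte is 0x41, not 0x1000
print(value_condition(0x0000, 0x2000, 0x41))       # unmapped address
```

Since the fault is about the address being unreadable, the address form is the one that describes the failing case; a condition on the stored value has nothing to read at that location.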

LLXX
November 29th, 2006, 20:37
Quote:
[Originally Posted by WaxfordSqueers;62718]But....if I decompress using the app's own setup.exe file, the error breaks at code address 4585E9 with [eax+edx] = 7A4052.
Ok then it's a pretty safe bet that your compressed data is corrupted, although that case where SoftICE's conditional breakpoint allowed it to go past the point with no error is still a bit puzzling.

Quote:
What puzzles me at the moment is how it uses the 0x400 values to create another value it subtracts from the initial 4 bytes. There is also some XORing, ANDing, ORing and the SHL and SHR commands. It's not at all what I expected from reading LZ77/78 theory. I'd appreciate any commentary you'd like to make.
All LZ implementations are slightly different, but you'll always find a repetition loop which copies the repeated sequences, as you've done so in your OP.

The shifts tend to indicate bit manipulation routines; as this does not look like it was written in Asm, compilers do tend to produce rather florid output even with relatively simple source code, which may obscure the actual intent.

More usefully, find some documentation on the compressed data format and relate the code with its actions on the data.

Kayaker
November 29th, 2006, 22:52
Everything looks OK with the syntax of your conditional bp's Waxford. I don't know if they're not breaking because of some quirk or because the condition really isn't being met, even though it seems like it should be. There shouldn't be a problem with conditional bp's per se.

You could test they are working right, for example it seems you can obviously break on the first of the following 3 instructions at some arbitrary point, doesn't matter which iteration. Find out the value of EAX at 4397EF, then set a conditional bpx 2 instructions later at 4397F3, using that value of eax. Since you *know* the value of eax, the bp *must* break!!

4397EF mov eax, edi
// if eax = 0xDEADBEEF
// set BPX 4397F3 if eax == 0xDEADBEEF
4397F1 sub eax, esi
4397F3 cmp eax, [ebp+var_4C]


There is one other possibility if you want to go retro and have really come to the end of your ways to attack this. You said you can do this and have the same problem on Win9x. In that case you could use the Backtrace command to log the execution path of a full iteration of the loop or whatever. This might help to figure out what the loop does or where it goes outside of the loop. If you do end up trying that I might suggest looking up and using the TraceDump SoftIce Backtrace Disassembler for this.

WaxfordSqueers
November 29th, 2006, 22:55
Quote:
[Originally Posted by LLXX;62745]Ok then it's a pretty safe bet that your compressed data is corrupted, although that case where SoftICE's conditional breakpoint allowed it to go past the point with no error is still a bit puzzling.
I have since confirmed that after leaving the app to slowly decompress under a softice BPX w/condition, the error message eventually comes up.

It's puzzling to me as well, why it won't break after the 0x200000 count with the conditional BPX set. Obviously, it breaks eventually, and I'm thinking I may have misunderstood the intent of the large loop. I read that in LZW compression, they insert a check to see if the compression is efficient, and if not, they reinitialize the buffer (dictionary) and start again.

I am currently tracing the code in detail using a text file from the compressed binary. There is another loop that counts from 0 to 0x100, and the decompression happens somewhere inside it. There is another loop that keeps track of 64K chunks (0x10000), and yet another that keeps track of the file size, subtracting 64K from it during one loop.
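
The chunk bookkeeping described here might look roughly like the following Python sketch. It only models the counters (a remaining-size value reduced by 0x10000 on each pass); the function name and the handling of a short final chunk are assumptions, not something read out of the installer.

```python
CHUNK = 0x10000  # 64K, the chunk size seen in the trace


def chunk_sizes(total_size):
    """Yield the size of each pass while counting the file size down."""
    remaining = total_size
    while remaining > 0:
        size = min(CHUNK, remaining)   # last chunk may be shorter (assumed)
        yield size
        remaining -= size


# Number of passes for a 300 MB file, taking 1 MB as 2**20 bytes.
print(len(list(chunk_sizes(300 * 1024 * 1024))))
```

At 64K per pass, a 300 MB file works out to 4800 passes, which gives some idea of how many outer iterations a breakpoint has to survive before reaching the bad spot.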


Quote:
[Originally Posted by LLXX;62745]The shifts tend to indicate bit manipulation routines; as this does not look like it was written in Asm, compilers do tend to produce rather florid output even with relatively simple source code, which may obscure the actual intent.
From what I understand, which isn't a lot at this time, the decompressor initializes the dictionary with 1's. That could be the 0x400's I saw being inserted in the area above 9D0000. It then outputs the first byte it finds, which should be followed by a code pointing into the dictionary.

I'm going to look more closely for codes. The dictionary 'should' also be initialized from 0 to 255 by the known ASCII codes. So, entries into the dictionary (symbol table) should start at 256 decimal and go to 4096. Unfortunately, they may also be using a hash code for easier access to the dictionary, so the codes probably won't point directly to the exact table entry.
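
For comparison, here is a minimal Python sketch of the textbook LZW decoder this paragraph describes: a table preloaded with the 256 single-byte values, with new entries added from code 256 up to a 4096-entry limit. It is plain LZW with no hashing and no variable-width codes, and the function name is made up. LZMA itself is an LZ77-style scheme combined with a range coder rather than LZW, so this shows the theory being compared against, not the installer's code.

```python
def lzw_decode(codes, max_entries=4096):
    """Decode a list of integer LZW codes into bytes."""
    table = {i: bytes([i]) for i in range(256)}   # codes 0..255: single bytes
    next_code = 256                               # first free table slot
    out = bytearray()
    prev = None
    for code in codes:
        if code in table:
            entry = table[code]
        elif code == next_code and prev is not None:
            # Code refers to the entry that is about to be created.
            entry = prev + prev[:1]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        out += entry
        if prev is not None and next_code < max_entries:
            table[next_code] = prev + entry[:1]   # previous string + first byte
            next_code += 1
        prev = entry
    return bytes(out)


print(lzw_decode([65, 66, 256, 258]))
```

A code that falls outside the table raises an error here; a decoder without such a check would index past its table instead, which is one way corrupt input can turn into a bad pointer.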

Quote:
[Originally Posted by LLXX;62745]More usefully, find some documentation on the compressed data format and relate the code with its actions on the data.
After an initial search, that seems a little easier said than done. I'm sure there is something out there, but LZMA is a relatively new implementation. I have the SDK with C source code, but it's painstaking for me to follow it through.

TBone
December 6th, 2006, 16:27
Quote:
[Originally Posted by LLXX]Even if you could attach a bus probe to the address and data bus traces, it might change the electrical characteristics of the bus enough to cause the error to occur somewhere else, or even disappear completely. Another example where the observation done to a system causes changes in it...


Quote:
[Originally Posted by WaxfordSqueers;62718]Am I detecting a bit of quantum theory here?

Apropos, I believe they call this sort of thing a Heisenbug.
http://en.wikipedia.org/wiki/Heisenbug

Some protections have code that behaves differently if you're tracing through it than if you're not. The process of tracing or single-stepping through the code alters the speed of execution enough to be (heuristically) detected.

Similarly, anti-virus researchers frequently use virtual machines as sandboxes for analyzing new viruses. Some virus writers take advantage of the sluggish kernel mode operations in a virtual machine to detect the virtualization generically, without knowing what specific VM software is being used. The virus is programmed to "play dead" in a VM environment, making it difficult to do empirical analysis of the virus without infecting a real machine.

JMI
December 6th, 2006, 18:15
Keeping track of the "timing" of certain sections of code is certainly nothing new. I distinctly remember some of my first efforts in tracing mac assembly code with MacNosy, back in the late 1980's, attempting to figure out PACE protection code for music software. It fairly frequently would use "tickcount" to track how long it took to move through certain sections of the decrypt code, so that it could tell if someone, or something, was stepping through the code, rather than it just plowing through at "normal" speed. Of course, one could eventually usually figure out how to "fudge" the resultant tickcount to fool the check routine.

Also interestingly enough, that protection also used DEADBEEF and BEEFABAD as part of its checksum routines to attempt to keep someone from changing any of the code. It would start with BEEFABAD and add all the instructions in a code resource together to get a "hash" to commence the decryption of the next code section, and so on and so on. At that time, mac code was in separate small sections which were only loaded on demand, as needed.
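
A running sum of that kind can be sketched in a few lines of Python. The 0xBEEFABAD seed comes from the description above; the 16-bit big-endian word size, the 32-bit wrap-around, the zero padding of odd-length input, and the function name are all assumptions made for illustration, since the actual routine is not shown.

```python
SEED = 0xBEEFABAD  # starting value mentioned in the post


def additive_hash(code, seed=SEED):
    """Sum 16-bit big-endian words of a code block onto the seed, modulo 2**32."""
    if len(code) % 2:
        code = code + b"\x00"        # pad odd-length input (assumed)
    total = seed
    for i in range(0, len(code), 2):
        word = int.from_bytes(code[i:i + 2], "big")
        total = (total + word) & 0xFFFFFFFF
    return total


original = bytes.fromhex("4e714e75")   # two arbitrary 16-bit words
patched = bytes.fromhex("4e714e71")    # same block with one word altered
print(hex(additive_hash(original)), hex(additive_hash(patched)))
```

Changing the value of any single word changes the sum, so a patched section would produce the wrong starting value for decrypting the next one.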

I finally learned enough assembly to figure out how to make it load, decrypt, and then write back to disk, all the decrypted separate code sections, and then I had the entire thing and could figure out where to remove the crap that was requiring a key disk to start-up. I actually had the key disk and a legitimate copy, I just didn't want to have to put it in the disk drive every time I started the program. That was how I actually first became interested in reversing.

Regards,

WaxfordSqueers
December 7th, 2006, 00:37
Quote:
[Originally Posted by TBone;62899]Apropos, I believe they call this sort of thing a Heisenbug.
http://en.wikipedia.org/wiki/Heisenbug
Thanks for the reply...thought my thread had died. Got a chuckle out of your URL article. I'm quite familiar with Heisenberg and Schrodinger, although their work is more of a pet peeve to me than anything. There's no denying the brilliance of people at that level of thinking, but it chagrins me to realize reality is being brought down to the level of mathematics. It's kind of scary too; to hear some scientists talk, math is real and the rest is imaginary.

Anyway, I relate to the 'bug' humour, although the bug I am encountering is difficult to locate due to the sheer size of the file. Like I said, one iteration loop is 0x200000 iterations long. The software is operating on 64K chunks.

Quote:
[Originally Posted by TBone;62899]Some protections have code that behaves differently if you're tracing through it than if you're not. The process of tracing or single-stepping through the code alters the speed of execution enough to be (heuristically) detected.
I really don't think this installer has any protection, although I realize the folly in jumping to such conclusions. I have a bit of experience with apps that have such protection and I'm keeping a wary eye out for it.

Quote:
[Originally Posted by TBone;62899]Similarly, anti-virus researchers frequently use virtual machines as sandboxes for analyzing new viruses. Some virus writers take advantage of the sluggish kernel mode operations in a virtual machine to detect the virtualization generically, without knowing what specific VM software is being used. The virus is programmed to "play dead" in a VM environment, making it difficult to do empirical analysis of the virus without infecting a real machine.
That's a good idea, actually. As LLXX was trying to point out, the app may have been behaving differently under a conditional breakpoint due to the way it slows the app down. I did leave it running overnight once to verify that, but it eventually broke with the same error message.

I might try it on vmware to see how it operates there.

WaxfordSqueers
December 7th, 2006, 00:59
Quote:
[Originally Posted by JMI;62905]Keeping track of the "timing" of certain sections of code is certainly nothing new.
Hey JMI, good to hear from you. I guess you're aware of it, but there's a new kid on the block, or relatively new, for checking that kind of timing. The QueryPerformanceCounter function counts the processor clock ticks from the time the processor is started, and I've seen it used to detect single-stepping. What they'll do is test how long it took a single instruction to execute.
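
The timing check described here boils down to reading a high-resolution counter before and after a short stretch of code and comparing the difference with a limit. A Python sketch of the idea follows, using time.perf_counter_ns as a stand-in for QueryPerformanceCounter. The threshold value and the function names are invented, and a real check would of course be native code.

```python
import time


def looks_traced(work, threshold_ns, clock=time.perf_counter_ns):
    """Return True if running work() took longer than threshold_ns."""
    start = clock()
    work()
    elapsed = clock() - start
    return elapsed > threshold_ns


def short_task():
    # Stand-in for the few instructions a protection would time.
    return sum(range(100))


# Generous limit: normal execution should finish well inside one second,
# while a human single-stepping through the task would not.
print(looks_traced(short_task, threshold_ns=1_000_000_000))
```

The clock is passed in as a parameter so the check can be exercised with a fake counter, which is also more or less what "fudging" the tick count amounts to from the attacker's side.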

I was very suspicious of it recently in an app using DX functions till Silver pointed out that timing is critical in such apps and the use of that function might be normal. That's the problem these days: you can waste hours trying to figure out if you're being detected or not. At least, at my level you can.

Quote:
[Originally Posted by JMI;62905]I finally learned enough assembly to figure out how to make it load, decrypt, and then write back to disk, all the decrypted separate code sections, and then I had the entire thing and could figure out where to remove the crap that was requiring a key disk to start-up.
That's the fun, isn't it? Once you know you're being watched, the mouse becomes the cat.

I've been involved with music software since the mid-80's. One of the first apps I had was a Roland sequencer/score writer on a 5 1/4 inch floppy. It had one of those protections with a track that had one extra sector. I wasn't into reversing in those days.

That app was more interesting in another way. The sequencer and scoring functions were such that you had to shut one down and transfer to the other, even though the main application never left the screen. It took about 30 seconds on an XT, but on the 386 it changed so fast it was almost undetectable. I can only imagine what it would do on my current 2 gig machine.

WaxfordSqueers
December 23rd, 2006, 20:00
I thought I'd wrap up this thread due to the unforgiving size of the file, the lack of information on this particular brand of LZMA compression, and basically a lack of return on time invested. I'm spending more time caught up in logical math than reversing. I wouldn't mind that so much if I had an algorithm to follow, but the lack of info on this compression is significant and it features some exotic hashing techniques on top of the basic compression. Besides, I have several other projects on the go that are far more practical.

I've been to the site of the inventor of this compression technique, and he's either evasive about the algorithm or he really doesn't have a good way to explain the software implementation. Several people have asked for the algorithm, since the compression is open source, but the creator does some fancy footstepping around the answers.

Thanks to everyone who replied and tried to help out.