Byte patching issue


sailor__eda
January 23rd, 2008, 00:07
I'm facing a really strange problem that I wanted to share with everyone.
I'm trying to patch a series of bytes in a Linux dynamic library. I'm patching a function call in two locations in identical ways.
Here is a code snippet:

.text:084E165B 8B 0D 04 DE D7 09 mov ecx, var1
.text:084E1661 89 0C 24 mov [esp+68h+var_68], ecx
.text:084E1664 E8 E7 3A 00 00 call Func1 <= patching this
.text:084E1669 A3 08 DE D7 09 mov var2, eax
.text:084E166E 8B 1D 08 DE D7 09 mov ebx, var2
.text:084E1674 85 DB test ebx, ebx
.text:084E1676 74 3A jz short loc_84E16B2

Func1 returns the results in eax so I wanted to patch the call to be mov eax, 0h instead.

Hence I was patching E8 E7 3A 00 00 with B8 00 00 00 00.
I fire up my favourite hex editor, look for my byte sequence and make the necessary changes. In fact, I have to do this patch in 2 locations, fairly close to each other.

Here's when the fun starts. If I disassemble my modified file after my changes, the 2nd patch location has mov eax, 0h exactly as I intended. The first patch location has mov eax, 383Eh!

I did this several times just to make sure I'm not screwing something up, and every time I had the same problem. In fact, after much trial and error, I realized that whatever immediate value I patched in seemed to have an offset of 383Eh added to it. I fixed the problem by computing 0h-383Eh and using that value instead, and that gave me mov eax, 0h.
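
The workaround described above is ordinary 32-bit wraparound arithmetic. A minimal sketch, assuming the added constant is exactly the 383Eh observed in the disassembly (the helper name `compensated_imm32` is made up for illustration):

```python
import struct

MASK32 = 0xFFFFFFFF

def compensated_imm32(wanted: int, added: int) -> bytes:
    """Little-endian imm32 to write so that (imm32 + added) wraps around to `wanted`."""
    value = (wanted - added) & MASK32  # subtraction modulo 2**32
    return struct.pack("<I", value)

# To end up with mov eax, 0 when 383Eh gets added afterwards:
imm = compensated_imm32(0x0, 0x383E)
patch = b"\xB8" + imm   # B8 is the opcode byte for mov eax, imm32
print(patch.hex(" "))   # b8 c2 c7 ff ff
```

So the bytes to write would be B8 C2 C7 FF FF, since FFFFC7C2h + 383Eh wraps to 0.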

So my question is: what is happening here? Why is it that making the same change a few bytes below works correctly, but not for the first location?
I was thinking it might have something to do with the relocation stuff in a dynamic library but that doesn't make sense for immediate values.

So what gives?

Sailor

dELTA
January 23rd, 2008, 13:23
Just a wild guess here, but maybe something having to do with instruction prefixes? Sometimes they don't show up in the disassembly raw data bytes for instructions, but still modify the behavior of the instruction. Still a little strange behavior though, I agree...

Opcode gurus where are you, anyone?

Polaris
January 23rd, 2008, 14:18
Hmmm... I cannot see anything wrong here... It should work correctly. In fact, from Intel's manual:

B8+ rd MOV r32,imm32 Move imm32 to r32

where +rd is "a register code, from 0 through 7, added to the hexadecimal byte given at the left of the plus sign to form a single opcode byte"... So everything is correct here, as register code for EAX is 0.
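
The "B8+ rd" rule quoted above can be checked with a few lines. This is only a sketch; the function name `mov_r32_imm32` is hypothetical, and the register codes are the standard 32-bit ones (EAX = 0 through EDI = 7):

```python
import struct

# Register codes for the 32-bit general-purpose registers (the "+rd" value).
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def mov_r32_imm32(reg: str, imm: int) -> bytes:
    """Encode MOV r32, imm32: one opcode byte (B8 + rd) followed by a little-endian imm32."""
    opcode = 0xB8 + REG32[reg]
    return bytes([opcode]) + struct.pack("<I", imm & 0xFFFFFFFF)

print(mov_r32_imm32("eax", 0).hex(" "))  # b8 00 00 00 00
print(mov_r32_imm32("ebx", 1).hex(" "))  # bb 01 00 00 00
```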

dELTA
January 23rd, 2008, 15:58
Hmm, interesting. Could it be a bug in IDA, or some kind of mistake on your part, sailor__eda? This must be solved!

Please upload both the patched and unpatched version here (that is, if it is a non-commercial library, which would be my guess since it is on Linux, or am I wrong? Otherwise please PM me instead).

sailor__eda
January 23rd, 2008, 21:51
You know, initially that's what I thought as well: IDA is making an error in the disassembly. But then the program didn't work like it was supposed to (i.e. as if eax == 0). Only when I did the mov eax, -383Eh did it work.

Sailor_eda

naides
January 23rd, 2008, 22:20
Little experiment: Instead of patching B8 00 00 00 00 mov eax, 0

Patch it to:

xor eax,eax
nop, nop, nop

or some variation on the theme:
xor eax,eax
inc eax
dec eax
nop


Same result, different code; then see what happens to the nops or inc/decs...

Perhaps somewhere else in the code the program is modifying the executable in memory and/or on disk?
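
Whatever replacement is tried, it has to occupy exactly the five bytes of the original call (E8 plus a 4-byte operand); xor eax,eax takes two bytes, which leaves room for three one-byte fillers. A small sketch to check candidate sequences, with the encodings hand-assembled for 32-bit mode and the helper names invented for this example:

```python
# Single-instruction encodings in 32-bit mode (hand-assembled for this sketch).
ENC = {
    "xor eax,eax": bytes.fromhex("31c0"),
    "nop": bytes.fromhex("90"),
    "inc eax": bytes.fromhex("40"),
    "dec eax": bytes.fromhex("48"),
}

CALL_SITE_LEN = 5  # E8 opcode plus a 4-byte rel32 operand

def assemble(*mnemonics: str) -> bytes:
    """Concatenate the known encodings in order."""
    return b"".join(ENC[m] for m in mnemonics)

def fits(patch: bytes) -> bool:
    """True if the patch fills the call site exactly, with no spill into the next instruction."""
    return len(patch) == CALL_SITE_LEN

with_nops = assemble("xor eax,eax", "nop", "nop", "nop")
with_incdec = assemble("xor eax,eax", "inc eax", "dec eax", "nop")
print(with_nops.hex(" "), fits(with_nops))      # 31 c0 90 90 90 True
print(with_incdec.hex(" "), fits(with_incdec))  # 31 c0 40 48 90 True
```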

sailor__eda
January 24th, 2008, 00:24
Hi Naides,

You know, I did exactly what you mention in the first patch, and it disassembled into something like
xor eax, XXXh - some number
push XXXYYh - something else
instead of xor eax, eax and nops.

I can send you the file if you're interested, pm me and let me know.

Thanks,

Sailor_eda

dELTA
January 24th, 2008, 05:00
I have not been able to look at the files you sent me yet in IDA or any other more advanced tool, but I quickly diffed them, and there were no patch mistakes as far as I can see.

Judging by the fact that this all happens both in IDA and "in reality" (i.e. the effect is confirmed when running the code) we can safely say that an IDA bug is out of the question anyway.

I'd say everything points to a relocation issue or similar effect, which would also be practically the only thing that would show up both in a static disassembly and during live debugging, and the behavior of a certain constant always being added/subtracted from the original bytes also supports this.

So, why would a "CALL rel32" instruction be "relocated"? I have no deep knowledge of relocation mechanisms, and even less so in Linux executables, but my guess is that the relative call is corrected/relocated for section alignment reasons. The other similar call instruction that you mention, which is not modified by IDA and the loader in this way, probably references a function within the same section (or a similar feature in ELF executables), while this problematic one references a function in another section. Am I right?

If you were somehow able to enumerate all relocation information in that file, I bet you'd see a reference to this address.
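
Enumerating the relocation entries is mostly a matter of walking the section headers. The following is a rough sketch only, assuming a 32-bit little-endian ELF whose relocations live in SHT_REL sections; the function name `list_rel_entries` is invented here, and it makes no attempt to handle 64-bit files, SHT_RELA sections, or stripped section tables:

```python
import struct

SHT_REL = 9  # section type for relocation entries without explicit addends
R_386_NAMES = {1: "R_386_32", 2: "R_386_PC32", 8: "R_386_RELATIVE"}

def list_rel_entries(elf: bytes):
    """Yield (r_offset, type_name, symbol_index) for every SHT_REL entry."""
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF image")
    # ELF32 header fields: e_shoff at 0x20, e_shentsize at 0x2E, e_shnum at 0x30.
    e_shoff, = struct.unpack_from("<I", elf, 0x20)
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf, 0x2E)
    for i in range(e_shnum):
        # Each ELF32 section header is ten 32-bit words.
        sh = struct.unpack_from("<10I", elf, e_shoff + i * e_shentsize)
        sh_type, sh_offset, sh_size = sh[1], sh[4], sh[5]
        if sh_type != SHT_REL:
            continue
        # Each Elf32_Rel entry is 8 bytes: r_offset, then r_info.
        for off in range(sh_offset, sh_offset + sh_size, 8):
            r_offset, r_info = struct.unpack_from("<II", elf, off)
            r_type, r_sym = r_info & 0xFF, r_info >> 8
            yield r_offset, R_386_NAMES.get(r_type, str(r_type)), r_sym
```

Reading the library into a bytes object and filtering the result for offsets near the patched call would show whether a relocation covers its operand. In a shared library r_offset is a virtual address, in an object file it is an offset into the section being relocated. The standard tools `readelf -r` and `objdump -R` print the same kind of information.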

naides
January 24th, 2008, 05:36
Sure Sailor. Please send me the files via PM.

One more avenue to explore: the hex editor. If you patch the original file, save it, and immediately open it (without running or disassembling it)
with the same or another hex editor, are the patched bytes still
.text:084E1664 B8 00 00 00 00
?????
I am wondering if the hex editor is doing something strange during the patching, an endian problem perhaps, I don't know...


Question 2: You mention two locations of patching: Do they point to the same function?

I would expect a relocation effect to modify both places, which call the same, relocated function.

dELTA
January 24th, 2008, 05:48
Quote:
[Originally Posted by naides;72195]One more avenue to explore: the hex editor. If you patch the original file, save it, and immediately open it (without running or disassembling it)
with the same or another hex editor, are the patched bytes still
.text:084E1664 B8 00 00 00 00
?????
As mentioned above, I've confirmed the patch to be correct, yes.


Quote:
[Originally Posted by naides;72195]I am wondering if the hex editor is doing something strange during the patching, an endian problem perhaps, I don't know...
Zero is 00 00 00 00 in both big-endian and little-endian, you know. (Just kidding, I understand that it could just as well theoretically have messed up something else too, but as mentioned above, I've confirmed it to be correct anyway.)


Quote:
[Originally Posted by naides;72195]Question 2: You mention two locations of patching: Do they point to the same function?

I would expect a relocation effect to modify both places, which call the same, relocated function.
If both calls are themselves located in the same section (or otherwise alignable/relocatable/movable area), AND the calls are referencing the same function, then it is strange indeed, yes. Otherwise please see my relocation explanation/hypothesis above.

naides
January 24th, 2008, 09:59
Quote:
[Originally Posted by dELTA;72196]As mentioned above, I've confirmed the patch to be correct, yes.


That makes two people that don't read the prince's posts carefully

sailor__eda
January 25th, 2008, 01:12
I forgot to mention it, but in the files I've sent, only the problem call is patched, just FYI.

Thanks for looking at this, I was wondering if I was going crazy

Sailor_eda

evlncrn8
January 25th, 2008, 01:16
Are you sure the proc call that you're patching (e8 xx xx xx xx -> b8 xx xx xx xx) takes NO parameters?

naides
January 25th, 2008, 14:08
I took a look at your files.

If you look at this code in IDA view:

Code:
.text:00A3781B mov ecx, dword_140EE64
.text:00A37821 mov [esp+68h+var_68], ecx
.text:00A37824 call Function1 <-- Problem
.text:00A37829 mov dword_140EE68, eax
.text:00A3782E mov ebx, dword_140EE68
.text:00A37834 test ebx, ebx
.text:00A37836 jz short loc_A37872
.text:00A37838 call Function2
.text:00A3783D test eax, eax
.text:00A3783F jnz short loc_A37848


And see the same area in HEX view:

Code:
.text:00A37820 01 89 0C 24 E8 E7 3A 00 00 A3 68 EE 40 01 8B 1D ...$..:...h.@...
.text:00A37830 68 EE 40 01 85 DB 74 3A E8 33 15 00 00 85 C0 75 h.@...t:.3.....u
.text:00A37840 07 31 D2 E9 B0 FE FF FF 8B 35 64 EE 40 01 89 34 .1.......5d.@..4


You'll see what is expected, E8 rel32. But looking at the same code region with a hex editor, directly from disk:

00A37820 0189 0C24 E8FC FFFF FFA3 68EE 4001 8B1D ...$......h.@...
00A37830 68EE 4001 85DB 743A E8FC FFFF FF85 C075 h.@...t:.......u
00A37840 0731 D2E9 B0FE FFFF 8B35 64EE 4001 8934 .1.......5d.@..4



You see that the call instructions in this region do not hold the relative address of the callee function, but rather a placeholder: FC FF FF FF, i.e. FFFFFFFC once read as a little-endian dword. Don't be intimidated by the Fs; as a signed integer, it is just -4.
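
A quick way to convince yourself of the -4 reading, shown here only as a sketch using Python's built-in integer conversion:

```python
raw = bytes.fromhex("FC FF FF FF")  # operand bytes as they appear on disk

unsigned_value = int.from_bytes(raw, "little")
signed_value = int.from_bytes(raw, "little", signed=True)

print(hex(unsigned_value))  # 0xfffffffc
print(signed_value)         # -4
```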




Actually, all the calls around this code area that point to a named symbol follow the same pattern: on disk they hold this very same placeholder instead of the relative function address.
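
Spotting such call sites in the raw file can be sketched as a plain byte search. The helper name `find_placeholder_calls` is made up, the sample bytes are the three disk rows shown above, and the match is only a heuristic, since the same five bytes could also occur inside data or in the middle of another instruction:

```python
PLACEHOLDER_CALL = bytes.fromhex("E8 FC FF FF FF")

def find_placeholder_calls(data: bytes, base: int = 0):
    """Return addresses (base + offset) where the pattern E8 FC FF FF FF occurs."""
    hits = []
    pos = data.find(PLACEHOLDER_CALL)
    while pos != -1:
        hits.append(base + pos)
        pos = data.find(PLACEHOLDER_CALL, pos + 1)
    return hits

# The three 16-byte rows from the on-disk hex dump, starting at 00A37820.
dump = bytes.fromhex(
    "01890C24E8FCFFFFFFA368EE40018B1D"
    "68EE400185DB743AE8FCFFFFFF85C075"
    "0731D2E9B0FEFFFF8B3564EE40018934"
)
print([hex(a) for a in find_placeholder_calls(dump, base=0x00A37820)])
# ['0xa37824', '0xa37838']
```

The two hits line up with the call Function1 and call Function2 lines in the IDA listing.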

Looking around for "ELF relocation" I ran into several articles:
http://www.linuxjournal.com/article/1059
http://www.securityfocus.com/infocus/1872

and others. . .


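So it seems that IDA and the Linux loader resolve symbolic call addresses at load time/disassembly time, using the relocation information in the file (the articles above describe the related Global Offset Table (GOT) and Procedure Linkage Table (PLT) machinery). In this case the relocation blindly adds a number to the four bytes at the call's operand position, where the placeholder is located, which is why a patched immediate comes out offset by a constant.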

Non-symbolic calls, i.e. calls that don't point to a named symbol, either local or external, which are labeled by IDA something like:

call sub_A40336

are treated differently. They indeed follow the familiar format E8 rel32 when examined both in the IDA hex view and in a hex editor directly from disk (I checked a few). This means that the call addresses for local calls are determined at compile time/link time, which is the usual case for Windows programs, instead of at load time.

So patching a call does not seem to be a good idea in ELF libraries; patching a jump, or some other strategy, may work better.

[edit]
Little add-on (if anyone cares): I figured out why the placeholder has FFFFFFFC (-4) in it. When the loader fills in the relative distance at the placeholder, an extra 4 bytes need to be subtracted so that the result is measured from the end of the call instruction, which is where rel32 is counted from:

Relative Offset = Symbol Virtual Address (from the symbol table) - Virtual Address of the placeholder - 4

The placeholder is 4 bytes long, so the end of the instruction lies 4 bytes past its start. Storing -4 in the placeholder builds that correction in.
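
The arithmetic can be sketched with the i386 PC-relative relocation rule, S + A - P, where S is the symbol address, P the address of the 4-byte field, and A the value already stored in that field. The thread does not confirm which relocation type is actually used here; R_386_PC32 is assumed because it matches the -4 placeholder. The target address S below is also an assumption, back-computed from the E8 E7 3A 00 00 bytes in the IDA listing, and the resulting constant is not meant to reproduce the 383Eh figure reported earlier in the thread:

```python
import struct

MASK32 = 0xFFFFFFFF

def apply_pc32(field: bytes, sym_addr: int, place_addr: int) -> bytes:
    """Apply a PC-relative REL-style relocation: result = S + A - P, with A read from the field."""
    addend, = struct.unpack("<i", field)  # A: signed value already stored in the 4 bytes
    value = (sym_addr + addend - place_addr) & MASK32
    return struct.pack("<I", value)

P = 0x00A37825  # address of the 4 operand bytes (the E8 opcode sits at 00A37824)
S = 0x00A3B310  # assumed call target, back-computed from IDA's E8 E7 3A 00 00

original = apply_pc32(bytes.fromhex("FCFFFFFF"), S, P)
patched = apply_pc32(bytes.fromhex("00000000"), S, P)
print(original.hex(" "))  # e7 3a 00 00
print(patched.hex(" "))   # eb 3a 00 00
```

With the placeholder in place the field becomes the expected rel32. With a patched immediate of zero, the same relocation still fires and leaves the constant S - P in the operand, so mov eax, 0 turns into mov eax with a nonzero immediate.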

Polaris
January 25th, 2008, 14:37
Good find Naides, really nice explanation!

blabberer
January 26th, 2008, 13:17
Well, I haven't used IDA on Linux much, but try using objdump.

for example
this could fetch all the rel calls and their opcodes

Code:

objdump -d /bin/ls -j .text -M intel | grep 'call' | grep 'e8' | more
80499ac: e8 3f fd ff ff call 0x80496f0
80499b8: e8 00 00 00 00 call 0x80499bd
8049a40: e8 bb 65 fb f7 call 0x0
8049a92: e8 39 fb ff ff call 0x80495d0
8049ab0: e8 db fd ff ff call 0x8049890
8049ad2: e8 f9 fa ff ff call 0x80495d0
8049b00: e8 0b fa ff ff call 0x8049510
8049b71: e8 4a f9 ff ff call 0x80494c0
8049b8f: e8 9c fb ff ff call 0x8049730
8049c04: e8 e7 b7 00 00 call 0x80553f0
8049cb8: e8 03 aa 00 00 call 0x80546c0
8049cdc: e8 2f 7e 00 00 call 0x8051b10
8049d03: e8 68 fb ff ff call 0x8049870
8049d17: e8 44 a9 00 00 call 0x8054660
8049d45: e8 26 fb ff ff call 0x8049870
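
The targets objdump prints follow directly from the bytes: target = address of the call + 5 + signed rel32, with 32-bit wraparound. A short sketch (the function name `call_target` is invented) that recomputes three of the lines above:

```python
import struct

def call_target(addr: int, raw: bytes) -> int:
    """Target of a 5-byte E8 rel32 call located at `addr`, with 32-bit wraparound."""
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("expected E8 followed by a 4-byte rel32")
    rel32, = struct.unpack("<i", raw[1:])  # signed little-endian displacement
    return (addr + 5 + rel32) & 0xFFFFFFFF

samples = [
    (0x80499AC, "e8 3f fd ff ff"),
    (0x8049C04, "e8 e7 b7 00 00"),
    (0x8049A40, "e8 bb 65 fb f7"),
]
for addr, hexbytes in samples:
    print(hex(addr), "->", hex(call_target(addr, bytes.fromhex(hexbytes))))
# 0x80499ac -> 0x80496f0
# 0x8049c04 -> 0x80553f0
# 0x8049a40 -> 0x0
```

The last line is the odd one out: its displacement resolves to address 0, which may indicate a reference that is not resolved in the file rather than a real callee.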

sailor__eda
January 26th, 2008, 14:44
Hi Naides,

That is some really good work. It all makes sense and I agree, patching calls is probably not a good idea for ELF libraries.
Cool. Thanks for the great investigation.

Sailor_eda