easy printf reversing


XFlorian
January 17th, 2005, 07:49
I have compiled a simple printf program under Linux:
gcc print.c -o print


#include <stdio.h>

void main ()
{
printf("A test");
}


Then I disassembled it with IDA Pro:
.text:08048384 public main
.text:08048384 main proc near ; DATA XREF: _start+17o
.text:08048384
.text:08048384 var_8 = dword ptr -8
.text:08048384
.text:08048384 push ebp
.text:08048385 mov ebp, esp ; EBP = ESP
.text:08048387 sub esp, 8
.text:0804838A and esp, 0FFFFFFF0h ; What is done here?
.text:0804838D mov eax, 0 ; Why is EAX = 0?
.text:08048392 sub esp, eax ; Why is it ESP - EAX?
.text:08048394 mov [esp+8+var_8], offset aEinTestMitEinf ; "A test"
.text:0804839B call _printf
.text:080483A0 leave
.text:080483A1 retn
.text:080483A1 main endp

My questions are in the comments above.

__do_global_dtors_aux proc near
frame_dummy
I didn't post the whole IDA listing; if you want it, I will post it.


Is there a tutorial or a book on this topic?

THX

lifewire
January 17th, 2005, 08:57
imho it is weird code indeed, not the usual HLL stuff. it just looks (or better, "feels") odd. but well, what I can make of it: the and aligns ESP to a 16-byte boundary. about the other stuff, and more in general: I think it has something to do with the varying number of arguments that is passed to printf, with va_list and other nasty stuff. It's just a guess though.

andrewg
January 17th, 2005, 09:14
The best method of understanding various compilers' output would have to be sitting down and trying various loops, procedure calls, how they handle the stack, etc., and then doing it again at different optimisation levels.

Some things are more file-format specific. For example, the name __do_global_dtors_aux suggests that it runs the global destructors (dtors). Documentation on the ELF spec can be found at http://www.x86.org/ftp/manuals/tools/elf.pdf.

GCC has had, imo, various bugs where it would do spurious things, such as mov eax, 0; sub esp, eax, which could be optimised out. Another case where I've seen some bugs is how it allocates space for some buffers, and how that varies... (which is annoying when you consider I was trying to play around with various off-by-ones and exploiting stuff.)

blabberer
January 17th, 2005, 13:09
actually gcc produces code which imho is not secure enough either.
for example, i compiled one of the c programs from andrewg's site with the free bcc command line tools, removing the linux-dependent code and substituting printf() in its place. i see bcc generates an entirely different output, where it absolutely avoids the possibility of an exploit.
bcc uses registers whereas gcc uses the stack.
the stack is writable, but a register isn't writable unless i set up an seh handler, get the context and modify the register there; that was the only way i could come up with.

0xf001
January 17th, 2005, 21:53
hi XFlorian,

.text:08048387 sub esp, 8
.text:0804838A and esp, 0FFFFFFF0h ; What is done here?

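indeed, here the stack is aligned: after sub esp, 8 reserves 8 bytes, the and clears the low four bits of ESP, which rounds it down to the next 16-byte boundary (down, because the stack grows "downwards" in memory).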
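
To make the effect of the mask concrete, here is a minimal sketch in C. The helper name align_down_16 is made up for illustration and the addresses used with it are arbitrary example values, not taken from the listing.

```c
#include <stdint.h>

/* Hypothetical helper: models what "and esp, 0FFFFFFF0h" does to a
 * 32-bit stack pointer. Clearing the low four bits rounds the value
 * down to a multiple of 16 and leaves aligned values untouched. */
uint32_t align_down_16(uint32_t esp)
{
    return esp & 0xFFFFFFF0u;
}
```

For example, an ESP of 0xBFFFF7AC becomes 0xBFFFF7A0, while 0xBFFFF7A0 stays as it is.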

.text:0804838D mov eax, 0 ; Why is EAX = 0?
.text:08048392 sub esp, eax ; Why is it ESP - EAX?

I tried several things, like using zero or more local variables in main, and came to this conclusion: these two instructions are always there (unless optimization is turned on).
I think this code comes originally from the request to align the stack
( ptth://gcc.gnu.org/ml/gcc-bugs/2001-08/msg00139.html ); and since the code of main() then starts at byte 16, this seems to be another alignment issue, maybe so that inline assembly gets an aligned code start.
And from a coder's perspective you get an easy way to move your stack around by patching the zeros, haha
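
As a rough illustration of why the mov eax, 0 / sub esp, eax pair changes nothing, the ESP arithmetic of the prologue can be modelled in plain C. This is only a sketch: esp_after_prologue is a hypothetical name, and the entry values are arbitrary examples.

```c
#include <stdint.h>

/* Hypothetical model of the ESP arithmetic in the unoptimized prologue
 * of main() from the listing. Only ESP and EAX are tracked. */
uint32_t esp_after_prologue(uint32_t esp_on_entry)
{
    uint32_t esp = esp_on_entry;
    uint32_t eax;

    esp -= 4;            /* push ebp */
                         /* mov ebp, esp does not change ESP */
    esp -= 8;            /* sub esp, 8 */
    esp &= 0xFFFFFFF0u;  /* and esp, 0FFFFFFF0h */
    eax = 0;             /* mov eax, 0 */
    esp -= eax;          /* sub esp, eax: subtracts zero, ESP unchanged */
    return esp;
}
```

With an entry value of 0xBFFFF7A4 the model ends at 0xBFFFF790; patching the zero in mov eax, 0 to another value would simply shift that result down by the same amount.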

__do_global_dtors_aux proc near
frame_dummy

those are called by the _init / _fini functions, which the C runtime startup files add to every gcc-compiled executable. the _init function calls e.g. frame_dummy and __do_global_ctors_aux, which runs the "global constructors" to initialize global data, and is therefore executed before main() (argc, argv and the environment are set up separately by the startup code in _start). __do_global_dtors_aux is the opposite and is run at the very end to clean up.
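
One way to see this before/after-main machinery from C is GCC's constructor and destructor function attributes. The sketch below assumes GCC (or a compatible compiler); the function names are made up, and whether such functions are dispatched through __do_global_ctors_aux or through a newer mechanism such as .init_array depends on the toolchain version, so take the connection to the listing as an assumption.

```c
#include <stdio.h>

static int ctor_ran = 0;  /* set by the constructor below */

/* GCC extension: this function is run before main() is entered. */
__attribute__((constructor))
static void before_main(void)
{
    ctor_ran = 1;
}

/* GCC extension: this function is run after main() returns
 * or exit() is called. */
__attribute__((destructor))
static void after_main(void)
{
    puts("destructor: running after main");
}

/* Hypothetical accessor so code in main() can observe the ordering. */
int constructor_has_run(void)
{
    return ctor_ran;
}
```

Calling constructor_has_run() as the first statement of main() should already return 1, and the destructor message appears only after main() has finished.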

Now when I looked at my gcc output, I realized that subsequent function calls are also provided with an aligned stack:

My gdb gave the output for your prog:
Code:

08048336 <main>:
8048336: 55 push ebp
8048337: 89 e5 mov ebp,esp
8048339: 83 ec 08 sub esp,0x8
804833c: 83 e4 f0 and esp,0xfffffff0
804833f: b8 00 00 00 00 mov eax,0x0
8048344: 29 c4 sub esp,eax
8048346: 83 ec 0c sub esp,0xc ; <==== 0xc
8048349: 6a 01 push 0x1 ; <==== + 0x4 = 0x10
804834b: e8 dc ff ff ff call 804832c <func0>
8048350: 83 c4 10 add esp,0x10
8048353: c9 leave
8048354: c3 ret


you see the sub esp, 0xc there. together with the 4-byte push, esp moves by exactly 0x10, i.e. it lands on a 16-byte boundary again.

when using two or three parameters the output shows it clearly:
Code:

08048336 <main>:
8048336: 55 push ebp
8048337: 89 e5 mov ebp,esp
8048339: 83 ec 08 sub esp,0x8
804833c: 83 e4 f0 and esp,0xfffffff0
804833f: b8 00 00 00 00 mov eax,0x0
8048344: 29 c4 sub esp,eax
8048346: 83 ec 08 sub esp,0x8 ; <===== 0x8
8048349: 6a 02 push 0x2 ; <===== + 0x4
804834b: 6a 01 push 0x1 ; <===== + 0x4 = 0x10
804834d: e8 da ff ff ff call 804832c <func0>
8048352: 83 c4 10 add esp,0x10
8048355: c9 leave
8048356: c3 ret

or
Code:

08048336 <main>:
8048336: 55 push ebp
8048337: 89 e5 mov ebp,esp
8048339: 83 ec 08 sub esp,0x8
804833c: 83 e4 f0 and esp,0xfffffff0
804833f: b8 00 00 00 00 mov eax,0x0
8048344: 29 c4 sub esp,eax
8048346: 83 ec 04 sub esp,0x4 ; <===== 0x4
8048349: 6a 03 push 0x3 ; <===== + 0x4
804834b: 6a 02 push 0x2 ; <===== + 0x4
804834d: 6a 01 push 0x1 ; <===== + 0x4 = 0x10
804834f: e8 d8 ff ff ff call 804832c <func0>
8048354: 83 c4 10 add esp,0x10
8048357: c9 leave
8048358: c3 ret

so there seems to be a lot of alignment stuff in gcc
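
The pattern in the three listings can be written down as a small formula. This is a sketch that only reproduces the numbers seen above (0xc, 0x8, 0x4 for one, two and three arguments); arg_padding is a hypothetical name, and the behaviour for other argument counts or other gcc versions is an assumption.

```c
/* Hypothetical helper: bytes subtracted from ESP before the pushes so
 * that padding plus the pushed arguments is a multiple of 16. */
unsigned arg_padding(unsigned nargs)
{
    unsigned arg_bytes = 4u * nargs;       /* each push is 4 bytes on i386 */
    return (16u - arg_bytes % 16u) % 16u;  /* 0 if already a multiple of 16 */
}
```

For one to three arguments, padding plus argument bytes adds up to 0x10, which matches the add esp,0x10 after each call in the listings.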

cheers, 0xf001

SiNTAX
January 18th, 2005, 05:35
Few remarks on this:

- gcc -S is useful too.. compiles to an asm file (filename.s)
- without -O2 (or -O) the output can be quite stupid
- gcc -g for debug info and then 'objdump -d' to disassemble can be handy too
- code compiled with -mregparm=X will use registers for function calling instead of stack (for i386.. x86_64 defaults to register calling, ie 6 args in regs, rest on the stack)


And last but not least.. the linux tools (objdump/gas/...) use AT&T style syntax, whereas IDA (and most windows tools) use Intel style.
In AT&T style: src, dst
In Intel style: dst, src

stanks
January 22nd, 2005, 13:08
Quote:
[Originally Posted by SiNTAX]Few remarks on this:

- gcc -S is useful too.. compiles to an asm file (filename.s)
...

Add -masm=intel and you will have intel syntax in asm file

stanks

SiNTAX
January 23rd, 2005, 09:50
To be honest.. having spent most of my asm days in Motorola 680x0 assembler (amiga), I always preferred the Motorola/AT&T syntax (left to right).