View Full Version : [help]function size


roxaz
July 7th, 2008, 05:33
I'm trying to find a way to get the size of a certain function. However, I got some very weird results =o The current idea I'm trying to implement is this:
I have 2 functions, one next to the other. Let's name them func1 and func2. I thought &func2 - &func1 = size of func1. However, I got 5! I checked the addresses in a debugger; the real size is 0x190. That kinda confused me. Does anyone have any idea how I could get the size of a func?

Yeah, I figured there would be curious guys who will ask me 'what for?' ;] I'm trying to inject my code into another process. And yes, I surely know about DLL injection, but I want to be more sneaky ;]

naides
July 7th, 2008, 05:41
Hi roxaz, welcome to the Board. Without further detail, this is what I guess is happening: The functions themselves are dynamically allocated in the Heap, which is standard behavior in new compilers. &func2 - &func1 probably gives you the distance between the pointers in memory. I vaguely remember this same theme being discussed previously here. Try searching the forum. I don't remember any key search word that may be useful.

dELTA
July 7th, 2008, 07:47
How can you acquire function pointers for non-exported functions from another process to begin with?

roxaz
July 7th, 2008, 07:51
Those functions are inside my process. My process starts another process. I want to copy some functions from my process to the started one.

dELTA
July 7th, 2008, 08:00
Ah, I see. Another trick, if the compiler abstracts the code too much with jump tables etc, as might be the case with your problem, is to insert "signatures" of asm code at the beginning and end of the targeted functions, and then search for these in the entire code section of your program (will still go very fast).

Example of such a signature:

jmp endlabel
dd signature_dword_1
dd signature_dword_2
...
endlabel:
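
A minimal sketch of the search side of this idea, assuming the code section has already been read into a byte buffer. The marker values and the names `FindSignature` and `SizeBetweenSignatures` are made up for illustration and are not from the thread. Note that the 2-byte jmp in front of the end marker is counted as part of the body here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-byte markers; in the real program these would be the two
// dd values emitted right after the "jmp endlabel" of each signature.
static const std::uint8_t kStartSig[8] = {0xDE,0xAD,0xBE,0xEF,0x01,0x02,0x03,0x04};
static const std::uint8_t kEndSig[8]   = {0xDE,0xAD,0xBE,0xEF,0x05,0x06,0x07,0x08};

// Returns the offset of the first occurrence of sig at or after `from`,
// or code.size() if it is not found.
std::size_t FindSignature(const std::vector<std::uint8_t>& code,
                          const std::uint8_t (&sig)[8], std::size_t from = 0)
{
    assert(from <= code.size());
    auto first = code.begin() + static_cast<std::ptrdiff_t>(from);
    auto it = std::search(first, code.end(), sig, sig + 8);
    return static_cast<std::size_t>(it - code.begin());
}

// Number of bytes between the end of the start marker and the beginning
// of the end marker; 0 if either marker is missing.
std::size_t SizeBetweenSignatures(const std::vector<std::uint8_t>& code)
{
    std::size_t start = FindSignature(code, kStartSig);
    if (start == code.size()) return 0;
    std::size_t bodyBegin = start + sizeof(kStartSig);
    std::size_t end = FindSignature(code, kEndSig, bodyBegin);
    if (end == code.size()) return 0;
    return end - bodyBegin;
}
```

Since the whole code section is scanned linearly once per marker, this stays fast even for large binaries, which matches the remark above.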

roxaz
July 7th, 2008, 08:11
hehe, great idea, thx so much ;]

OHPen
July 7th, 2008, 10:01
Hi,

as naides said you are probably dealing with jump tables. If your functions inside the binary are really next to each other without displacement or padding stuff, then you could simply extract the addresses out of the jmp instructions in the table to get the real addresses of the functions.
You wouldn't even need a full disassembler for it, as a simple jump / call detection would be sufficient in this case.
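
For reference, a jump-table entry of this kind is typically a 5-byte jmp rel32 (opcode 0xE9 plus a 32-bit displacement), which would explain a difference of exactly 5 between two neighbouring entries. A rough sketch of following such an entry, working on offsets into a buffer rather than live addresses; `ResolveJmpThunk` is a hypothetical helper name and a little-endian host is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// If the 5 bytes at image[off] are "jmp rel32" (opcode 0xE9), return the
// offset the jump lands on; otherwise return off unchanged.
std::size_t ResolveJmpThunk(const std::uint8_t* image, std::size_t size, std::size_t off)
{
    assert(off < size);
    if (size - off < 5 || image[off] != 0xE9) return off;
    std::int32_t rel;
    std::memcpy(&rel, image + off + 1, sizeof(rel)); // little-endian host assumed
    // rel32 is relative to the address of the NEXT instruction (off + 5).
    return static_cast<std::size_t>(static_cast<std::int64_t>(off) + 5 + rel);
}
```

With both entries resolved, subtracting the two results gives the distance between the real function bodies instead of the distance between the table entries.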

Might be another solution, but maybe the signature solution is the more simple one.

Regards,

OHPen

Camus SoNiCo
July 7th, 2008, 10:11
Did you try turning off Incremental Linking in the linker settings?

Cheers,
Camus

roxaz
July 7th, 2008, 10:44
great man, it worked perfectly. thank you so much ;]

naides
July 7th, 2008, 12:38
What worked, roxaz: OHPen's suggestion, Camus's suggestion or dELTA's suggestion?

roxaz
July 7th, 2008, 13:03
one by Camus, sorry i forgot to quote ^^

darawk
July 7th, 2008, 14:18
I wrote this code a while back to calculate the length of a function. It only works of course if all of the blocks are contiguous (not necessarily in execution order, but just overall contiguous):

Code:


/************************************************************************
Function length calculation algorithm - by Darawk:

1. Scan the function's code for branches, and record each branch. Stop
upon reaching an end-point*. This group of instructions constitutes
the current "block".
2. QSort the branch list
3. Recursively repeat steps 1 & 2 with each branch, skipping duplicates
and intra-block branches.

*end-point: A ret instruction or an unconditional backwards jump,
that jumps to a previous block.
************************************************************************/

u32 GetFunctionLength(void *begin)
{
void *end = GetFunctionEnd(begin);
u32 delta = (u32)((DWORD_PTR)end - (DWORD_PTR)begin);
delta += mlde32(end);
return delta;
}

void *GetFunctionEnd(void *func)
{
void *block = func;
vector<void *> branchList;
// ptr now points to the end of this block
void *blockend = GetBranchListFromBlock(block, branchList);

// If there are no branches, then return
// the empty list. If we don't have this
// here the loop will crash on an empty
// branch list.
if(branchList.size() == 0) return blockend;

// Sort the list so that we can identify and
// discard, intra-block branches. And optimize
// the removal of duplicates.
std::sort(branchList.begin(), branchList.end());

void *prev = NULL;
vector<void *>::iterator branch;
for(branch = branchList.begin(); branch != branchList.end(); branch++)
{
// Skip branches that jump into a block we've already
// processed.
if(*branch < blockend || *branch == prev)
continue;

blockend = GetFunctionEnd(*branch);
prev = *branch;
}

return blockend;
}

void *GetBranchListFromBlock(void *block, vector<void *> &branchList)
{
u8 *ptr = (u8 *)block;

// If we reach an end-point, then this block is complete
while(!IsEndPoint(ptr, block))
{
// Record all branching instructions that we encounter
void *address = GetBranchAddress(ptr);
if(address)
{
branchList.push_back(address);
}

// Next instruction
ptr += mlde32(ptr);
}

return ptr;
}


void *GetBranchAddress(u8 *instr)
{
s32 offset = 0;
// This code will determine what type of branch it is, and
// determine the address it will branch to.
switch(*instr)
{
case INSTR_SHORTJMP:
case INSTR_RELJCX:
offset = (s32)(*(s8 *)(instr + 1));
offset += 2;
break;
case INSTR_RELJMP:
offset = *(s32 *)(instr + 1);
offset += 5;
break;
case INSTR_NEAR_PREFIX:
if(*(instr + 1) >= INSTR_NEARJCC_BEGIN && *(instr + 1) <= INSTR_NEARJCC_END)
{
offset = *(s32 *)(instr + 2);
offset += 6; // 0F 8x rel32 is 6 bytes long
}
break;
default:
// Check to see if it's in the valid range of JCC values.
// e.g. ja, je, jne, jb, etc..
if(*instr >= INSTR_SHORTJCC_BEGIN && *instr <= INSTR_SHORTJCC_END)
{
offset = (s32)*((s8 *)(instr + 1));
offset += 2;
}
break;
}

if(offset == 0) return NULL;
return instr + offset;
}

bool IsEndPoint(u8 *instr, void *curblock)
{
void *address;
s32 offset;
switch(*instr)
{
case INSTR_RET:
case INSTR_RETN:
case INSTR_RETFN:
case INSTR_RETF:
return true;
break;

// The following two checks, look for an instance in which
// an unconditional jump returns us to a previous block,
// thus creating a pseudo-endpoint.
case INSTR_SHORTJMP:
offset = (s32)(*(s8 *)(instr + 1));
address = instr + 2 + offset; // rel8 is relative to the next instruction
if(address <= curblock) return true;
break;
case INSTR_RELJMP:
offset = *(s32 *)(instr + 1);
address = instr + 5 + offset; // rel32 is relative to the next instruction
if(address <= curblock) return true;
break;
default:
return false;
break;
}

return false;
}

roxaz
July 7th, 2008, 14:29
That is a piece of art ^^ Gee, really thx, best method of all. Maybe you could add the missing defines and tell me what the mlde32 func does?

deroko
July 7th, 2008, 14:54
mlde32 is a length disassembler engine

http://vx.netlux.org/vx.php?id=em24
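
To illustrate what a length disassembler does, here is a toy version that only knows a handful of instructions. It is not mlde32 and not a usable replacement for it (no ModRM, SIB or prefix handling); the function names and the opcode subset are chosen just for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy decoder: returns the length of the instruction at p for a small,
// hand-picked subset of 32-bit x86 opcodes, or 0 if the opcode is not
// covered or the instruction would run past the available bytes.
std::size_t ToyInstrLength(const std::uint8_t* p, std::size_t avail)
{
    if (avail == 0) return 0;
    std::uint8_t op = p[0];
    std::size_t len = 0;
    if (op == 0x90 || op == 0xC3 || op == 0xCC) len = 1;      // nop, ret, int3
    else if (op >= 0x50 && op <= 0x5F) len = 1;               // push/pop r32
    else if (op == 0xC2) len = 3;                             // ret imm16
    else if (op == 0xEB || op == 0xE3) len = 2;               // jmp rel8, jecxz rel8
    else if (op >= 0x70 && op <= 0x7F) len = 2;               // jcc rel8
    else if (op == 0xE8 || op == 0xE9) len = 5;               // call/jmp rel32
    else if (op >= 0xB8 && op <= 0xBF) len = 5;               // mov r32, imm32
    else if (op == 0x0F && avail >= 2 && p[1] >= 0x80 && p[1] <= 0x8F)
        len = 6;                                              // jcc rel32
    return (len <= avail) ? len : 0;
}

// Walks the code instruction by instruction and returns the number of bytes
// up to and including the first ret, or 0 if an unknown opcode or the end
// of the buffer is hit first.
std::size_t ToyLengthToFirstRet(const std::uint8_t* code, std::size_t size)
{
    std::size_t off = 0;
    while (off < size) {
        std::size_t len = ToyInstrLength(code + off, size - off);
        if (len == 0) return 0;
        std::uint8_t op = code[off];
        off += len;
        if (op == 0xC3 || op == 0xC2) return off;
    }
    return 0;
}
```

The `ptr += mlde32(ptr)` loop in the code above walks instructions the same way, just with a decoder that covers the full instruction set.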

Kayaker
July 7th, 2008, 15:22
Nice one deroko darawk. I should have used that algorithm in my IceProbe disassembler. It wasn't needed for the job at hand, but that's a much nicer implementation to generate a complete disasm of a particular function with all the scattered code chunks.

What if the blocks are *not* contiguous, which may be the more common case. For example, 2 or 3 functions may jump to the same shared endpoint return code chunk, therefore that chunk couldn't be contiguous with at least one of the functions.

Would the algo you posted not work in that case, or could it be modified to work?

I knew there was a way to do it, I was just too lazy to figure it out
I may use that idea some day to update the disasm engine I use.

Thanks for the code.

Cheers,
Kayaker

darawk
July 7th, 2008, 15:37
Yea, you could modify it to do that. This algorithm was designed for the purpose of supplying a length to a memcpy() so that I could copy arbitrary functions out of other modules, provided they didn't behave as you described. It would of course be possible to do the same with the more complex type of function that jumps all around the module (such as many in ntdll), but that is a little bit more difficult and wasn't necessary for what I was doing at the time.

roxaz
July 7th, 2008, 16:11
Is it so difficult? We would calculate the length of every chunk and add them up. Sounds quite easy. Copying such a func should be hard though.

P.S. still it would be nice to get defines such as INSTR_RELJMP, INSTR_SHORTJCC_BEGIN, INSTR_SHORTJCC_END and many others ^_^

EDIT:
here ya go roxaz!
Quote:

#define INSTR_NEAR_PREFIX 0x0F
#define INSTR_FARJMP 0x2D // Far jmp prefixed with INSTR_FAR_PREFIX
#define INSTR_SHORTJCC_BEGIN 0x70
#define INSTR_SHORTJCC_END 0x7F
#define INSTR_NEARJCC_BEGIN 0x80 // Near's are prefixed with INSTR_NEAR_PREFIX byte
#define INSTR_NEARJCC_END 0x8F
#define INSTR_RET 0xC2
#define INSTR_RETN 0xC3
#define INSTR_RETFN 0xCA
#define INSTR_RETF 0xCB
#define INSTR_INT3 0xCC
#define INSTR_RELJCX 0xE3
#define INSTR_RELCALL 0xE8
#define INSTR_RELJMP 0xE9
#define INSTR_SHORTJMP 0xEB
#define INSTR_FAR_PREFIX 0xFF

gee, many thx to roxaz ;]]

darawk
July 8th, 2008, 11:50
It shouldn't be difficult at all really. The only reason I didn't do it is because I was thinking too narrowly about the problem I was trying to solve. I wanted to be able to rip an arbitrary function out of any module on the fly, and at the time I wrote this, I shortsightedly limited myself to contiguous functions - but there really is no reason that this couldn't work on non-contiguous ones. All you'd have to do is re-order the blocks and tweak the jmp's to fit the new shape of the function if the function wasn't already contiguous (assuming your goal is copying the function - if it's just counting bytes or instructions, then this isn't necessary).
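
The jmp tweaking mentioned above comes down to re-encoding relative displacements. A small sketch, assuming a 5-byte jmp rel32 or call rel32 that has been copied to a new address and must still reach its old absolute target; `RetargetRel32` is a hypothetical helper, 32-bit addresses are passed as plain integers, and a little-endian host is assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Re-encodes the rel32 of a 5-byte "jmp/call rel32" so that it still reaches
// the same absolute target after the instruction moves from oldAddr to newAddr.
// instr points at the copied instruction bytes (opcode 0xE8 or 0xE9).
void RetargetRel32(std::uint8_t* instr, std::uint32_t oldAddr, std::uint32_t newAddr)
{
    assert(instr[0] == 0xE8 || instr[0] == 0xE9);
    std::int32_t rel;
    std::memcpy(&rel, instr + 1, sizeof(rel));
    // Absolute target = address of the next instruction + displacement.
    std::uint32_t target = oldAddr + 5 + static_cast<std::uint32_t>(rel);
    // New displacement, again relative to the next instruction (wraps mod 2^32).
    std::uint32_t newRel = target - (newAddr + 5);
    std::memcpy(instr + 1, &newRel, sizeof(newRel));
}
```

A jump whose source and target move together by the same amount keeps its displacement, so only branches that cross between re-ordered blocks (or leave the function) would need this.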

EDIT: Oh, and here are my original definitions. We even made the same comment, lol.

Code:
#define INSTR_NEAR_PREFIX 0x0F
#define INSTR_SHORTJCC_BEGIN 0x70
#define INSTR_SHORTJCC_END 0x7F
#define INSTR_NEARJCC_BEGIN 0x80 // Near's are prefixed with a 0x0F byte
#define INSTR_NEARJCC_END 0x8F
#define INSTR_RET 0xC2
#define INSTR_RETN 0xC3
#define INSTR_RETFN 0xCA
#define INSTR_RETF 0xCB
#define INSTR_RELJCX 0xE3
#define INSTR_RELJMP 0xE9
#define INSTR_SHORTJMP 0xEB

deroko
July 8th, 2008, 12:36
Quote:
[Originally Posted by Kayaker;75725]Nice one deroko. I should have used that algorithm in my IceProbe disassembler.


Not my code, it's from darawk. I just gave a reference for mlde32, which is used in the code.

Kayaker
July 8th, 2008, 15:46
Oops, misplaced credit, sorry

homersux
July 27th, 2008, 22:07
Quote:
[Originally Posted by deroko;75722]mlde32 is length disassembler engine

http://vx.netlux.org/vx.php?id=em24


Hello, beautiful code. Do you mind uploading this engine zip file? the link you gave appears dead to me.

H

Kayaker
July 27th, 2008, 22:17
If it's not readily available elsewhere it sounds like a candidate for CRCETL. Wasn't there an XDE engine as well?

deroko
July 28th, 2008, 10:08
@homersux: indeed, I can't download it from that link either, but I found a copy of it in 29a e-zine #7, which you may download from: http://vx.org.ua/29a/main.html . I've also attached mlde32.

@Kayaker: yes there is, I think it's in 29a #8, but I don't know if it's the updated version, as Zeljko Vrba described a little bug in it here: http://www.phrack.org/issues.html?id=13&issue=63

Quote:

----[ 3.4 - XDE bug


During the development, I have found a bug in the XDE disassembler
engine: it didn't correctly handle the LOCK (0xF0) prefix. Because of the
bug XDE claimed that 0xF0 is a single-byte instruction. This is the
needed patch to correct the disassembler:

--- xde.c Sun Apr 11 02:52:30 2004
+++ xde_new.c Mon Aug 23 08:49:00 2004
@@ -101,6 +101,8 @@
if (c == 0xF0)
{
if (diza->p_lock != 0) flag |= C_BAD; /* twice */
+ diza->p_lock = c;
+ continue;
}

break;

I also needed to remove __cdecl on functions, a 'feature' of Win32 C
compilers not needed on UNIX platforms.
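
The patch amounts to treating 0xF0 like any other prefix: remember it and keep decoding instead of stopping. A stripped-down sketch of such a prefix loop, not XDE's actual code; `IsLegacyPrefix` and `CountPrefixes` are names invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True for the legacy x86 prefix bytes: LOCK, REPNE, REP, operand-size,
// address-size and the six segment overrides.
bool IsLegacyPrefix(std::uint8_t b)
{
    switch (b) {
    case 0xF0: case 0xF2: case 0xF3:            // lock, repne, rep
    case 0x66: case 0x67:                       // operand-size, address-size
    case 0x26: case 0x2E: case 0x36: case 0x3E: // es, cs, ss, ds
    case 0x64: case 0x65:                       // fs, gs
        return true;
    default:
        return false;
    }
}

// Counts the prefix bytes in front of the opcode. A length decoder adds this
// count to the length of the instruction that follows, rather than reporting
// a prefix such as 0xF0 as a one-byte instruction of its own.
std::size_t CountPrefixes(const std::uint8_t* p, std::size_t avail)
{
    std::size_t n = 0;
    while (n < avail && IsLegacyPrefix(p[n])) ++n;
    return n;
}
```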


Fyyre
July 28th, 2008, 18:55
XDE version 1.02 fixes the lock prefix bug:

eXtended (XDE) disassembler engine
----------------------------------
version 1.02

History:

1.01 - 1st release
1.02 - lock prefix bug is fixed, thanx to www.core-dump.com.hr

http://vx.netlux.org/vx.php?id=ex01

homersux
July 28th, 2008, 19:53
Thanks for the update, Fyyre.

H

dELTA
July 29th, 2008, 03:50
Quote:
[Originally Posted by Kayaker;76203]If it's not readily available elsewhere it sounds like a candidate for CRCETL. Wasn't there an XDE engine as well?
Indeed, and that's why I already put it there the first time it was mentioned in this thread.

http://www.woodmann.com/collaborative/tools/Mlde32

And btw, XDE just magically appeared there too:

http://www.woodmann.com/collaborative/tools/EXtended_Disassembler_Engine_%28XDE%29

Interested parties should probably take a look at the entire "X86 Disassembler Libraries" category:

http://www.woodmann.com/collaborative/tools/Category:X86_Disassembler_Libraries

aqrit
September 14th, 2008, 15:31
(1)
Include the code to inject as a binary resource
then just call SizeofResource()

(2)
use "#pragma section(...)" to put the code to inject in its own section
then walk the PE header to determine the size of the section

(3)
Put one label on the first instruction and another after the last instruction.
Subtract the addresses of the two labels to get
the size of the code between them.

A label has only function scope, which is a small pain...

Code:
DWORD StartCodeAddress, EndCodeAddress, CodeSize;

void someFunc(){
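// C++ can't get the address-of a label??
// lea needs a register destination, so go through eax instead
__asm mov eax, offset StartLabel
__asm mov StartCodeAddress, eax
__asm mov eax, offset EndLabel
__asm mov EndCodeAddress, eax
CodeSize = EndCodeAddress - StartCodeAddress;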

return;

__asm{
StartLabel: pop eax
pop edx
push eax
mov eax,edx // dummy code
mov edx,eax
ret

EndLabel: nop // a label must point at code!
// note that the nop is not included in the CodeSize
}

}

BanMe
October 27th, 2008, 11:18
GetFunctionLength (by darawk) uses the CRT, which I am not a big fan of, so I redid this excellent piece of code without the CRT (still working on redoing std::sort(), but it works nonetheless). This code also includes mlde32 by uNdErX, making compilation much easier.
Code:


#define INSTR_NEAR_PREFIX 0x0F
#define INSTR_SHORTJCC_BEGIN 0x70
#define INSTR_SHORTJCC_END 0x7F
#define INSTR_NEARJCC_BEGIN 0x80 // Near's are prefixed with a 0x0F byte
#define INSTR_NEARJCC_END 0x8F
#define INSTR_RET 0xC2
#define INSTR_RETN 0xC3
#define INSTR_RETFN 0xCA
#define INSTR_RETF 0xCB
#define INSTR_RELJCX 0xE3
#define INSTR_RELJMP 0xE9
#define INSTR_SHORTJMP 0xEB

#define O_UNIQUE 0
#define O_PREFIX 1
#define O_IMM8 2
#define O_IMM16 3
#define O_IMM24 4
#define O_IMM32 5
#define O_IMM48 6
#define O_MODRM 7
#define O_MODRM8 8
#define O_MODRM32 9
#define O_EXTENDED 10
#define O_WEIRD 11
#define O_ERROR 12

typedef struct _TreeTrunk
{
void * MemStart;
DWORD MemTotalSize;
DWORD NumTotalEntry;
DWORD NumEntry;
}TreeTrunk;

void *GetFunctionEnd(void *func);
void *GetBranchListFromBlock(void *block, TreeTrunk *branchList);
void *GetBranchAddress(UCHAR *instr);
bool IsEndPoint(UCHAR *instr, void *curblock);
unsigned int GetFunctionLength(void *begin);
int __cdecl mlde32(void*codeptr);

__declspec(naked)int __cdecl mlde32(void*codeptr)
{
__asm
{
pushad

cld
xor edx, edx

mov esi, [esp+(8*4)+4]
mov ebp, esp

; 256 bytes, index-compressed opcode type table
push 01097F71Ch
push 0F71C6780h
push 017389718h
push 0101CB718h
push 017302C17h
push 018173017h
push 0F715F547h
push 04C103748h
push 0272CE7F7h
push 0F7AC6087h
push 01C121C52h
push 07C10871Ch
push 0201C701Ch
push 04767602Bh
push 020211011h
push 040121625h
push 082872022h
push 047201220h
push 013101419h
push 018271013h
push 028858260h
push 015124045h
push 05016A0C7h
push 028191812h
push 0F2401812h
push 019154127h
push 050F0F011h
mov ecx, 015124710h
push ecx
push 011151247h
push 010111512h
push 047101115h
mov eax, 012472015h
push eax
push eax
push 012471A10h
add cl, 10h
push ecx
sub cl, 20h
push ecx

xor ecx, ecx
dec ecx

; code starts
ps: inc ecx
mov edi, esp
go: lodsb
mov bh, al
ft: mov ah, [edi]
inc edi
shr ah, 4
sub al, ah
jnc ft

mov al, [edi-1]
and al, 0Fh

cmp al, O_ERROR
jnz i7

pop edx
not edx

i7: inc edx
cmp al, O_UNIQUE
jz t_exit

cmp al, O_PREFIX
jz ps

add edi, 51h ;(_ettbl - _ttbl)

cmp al, O_EXTENDED
jz go

mov edi, [ebp+(8*4)+4]

i6: inc edx
cmp al, O_IMM8
jz t_exit
cmp al, O_MODRM
jz t_modrm
cmp al, O_WEIRD
jz t_weird

i5: inc edx
cmp al, O_IMM16
jz t_exit
cmp al, O_MODRM8
jz t_modrm

i4: inc edx
cmp al, O_IMM24
jz t_exit

i3: inc edx
i2: inc edx

pushad
mov al, 66h
repnz scasb
popad
jnz c32

d2: dec edx
dec edx

c32: cmp al, O_MODRM32
jz t_modrm
sub al, O_IMM32
jz t_imm32

i1: inc edx

t_exit:
mov esp, ebp
mov [esp+(7*4)], edx
popad
ret

;*********************************
;* PROCESS THE MOD/RM BYTE *
;* *
;* 7 6 5 3 2 0 *
;* | MOD | Reg/Opcode | R/M | *
;* *
;*********************************
t_modrm:
lodsb
mov ah, al
shr al, 7
jb prmk
jz prm

add dl, 4

pushad
mov al, 67h
repnz scasb
popad
jnz prm

d3: sub dl, 3

dec al
prmk:jnz t_exit
inc edx
inc eax
prm:
and ah, 00000111b

pushad
mov al, 67h
repnz scasb
popad
jz prm67chk

cmp ah, 04h
jz prmsib

cmp ah, 05h
jnz t_exit

prm5chk:
dec al
jz t_exit
i42: add dl, 4
jmp t_exit

prm67chk:
cmp ax, 0600h
jnz t_exit
inc edx
jmp i1

prmsib:
cmp al, 00h
jnz i1
lodsb
and al, 00000111b
sub al, 05h
jnz i1
inc edx
jmp i42

;****************************
;* PROCESS WEIRD OPCODES *
;* *
;* Fucking test (F6h/F7h) *
;* *
;****************************
t_weird:
test byte ptr [esi], 00111000b
jnz t_modrm

mov al, O_MODRM8

shr bh, 1
adc al, 0
jmp i5

;*********************************
;* PROCESS SOME OTHER SHIT *
;* *
;* Fucking mov (A0h/A1h/A2h/A3h) *
;* *
;*********************************
t_imm32:
sub bh, 0A0h

cmp bh, 04h
jae d2

pushad
mov al, 67h
repnz scasb
popad
jnz chk66t

d4: dec edx
dec edx

chk66t:
pushad
mov al, 66h
repnz scasb
popad
jz i1
jnz d2
}
}
unsigned int GetFunctionLength(void *begin)
{
void *end = GetFunctionEnd(begin);
unsigned int delta = (unsigned int)((DWORD_PTR)end - (DWORD_PTR)begin);
delta += mlde32(end);
return delta;
}

void *GetFunctionEnd(void *func)
{
void *block = func;
TreeTrunk Tree;
Tree.MemTotalSize = 256;
Tree.MemStart = HeapAlloc(GetProcessHeap(),HEAP_ZERO_MEMORY,Tree.MemTotalSize);
if(Tree.MemStart == NULL) return NULL;
Tree.NumTotalEntry = Tree.MemTotalSize/sizeof(DWORD);
Tree.NumEntry = 0;
// silenttree now points to the end of this block
void *silenttree = GetBranchListFromBlock(block, &Tree);
DWORD Prev = 0;
// NOTE: the list is not sorted yet (std::sort still has to be replaced)
for(DWORD i = 0; i < Tree.NumEntry; i++)
{
// Read the i-th stored branch target (32-bit build: a pointer fits in a DWORD)
DWORD Limb = ((DWORD*)Tree.MemStart)[i];
// Skip branches that jump into a block we've already
// processed.
if(Limb < (DWORD)silenttree || Limb == Prev)
continue;
silenttree = GetFunctionEnd((void*)Limb);
Prev = Limb;
}
// Release the branch list on every path, not only on failure
HeapFree(GetProcessHeap(),0,Tree.MemStart);
return silenttree;
}
void *GetBranchListFromBlock(void *block, TreeTrunk *Tree)
{
UCHAR *ptr = (UCHAR *)block;
// If we reach an end-point, then this block is complete
while(!IsEndPoint(ptr, block))
{
// Record all branching instructions that we encounter
void *address = GetBranchAddress(ptr);
// Ignore further branches once the fixed-size list is full
if(address && Tree->NumEntry < Tree->NumTotalEntry)
{
// Store the branch target itself, not the bytes it points at
((DWORD*)Tree->MemStart)[Tree->NumEntry] = (DWORD)address;
Tree->NumEntry++;
}
// Next instruction
ptr += mlde32(ptr);
}
return ptr;
}


void *GetBranchAddress(UCHAR *instr)
{
long offset = 0;
// This code will determine what type of branch it is, and
// determine the address it will branch to.
switch(*instr)
{
case INSTR_SHORTJMP:
case INSTR_RELJCX:
offset = (long)(*(char *)(instr + 1));
offset += 2;
break;
case INSTR_RELJMP:
offset = *(long *)(instr + 1);
offset += 5;
break;
case INSTR_NEAR_PREFIX:
if(*(instr + 1) >= INSTR_NEARJCC_BEGIN && *(instr + 1) <= INSTR_NEARJCC_END)
{
offset = *(long *)(instr + 2);
offset += 6; // 0F 8x rel32 is 6 bytes long
}
break;
default:
// Check to see if it's in the valid range of JCC values.
// e.g. ja, je, jne, jb, etc..
if(*instr >= INSTR_SHORTJCC_BEGIN && *instr <= INSTR_SHORTJCC_END)
{
offset = (long)*((char *)(instr + 1));
offset += 2;
}
break;
}

if(offset == 0) return NULL;
return instr + offset;
}

bool IsEndPoint(UCHAR *instr, void *curblock)
{
void *address;
long offset;
switch(*instr)
{
case INSTR_RET:
case INSTR_RETN:
case INSTR_RETFN:
case INSTR_RETF:
return true;
break;

// The following two checks, look for an instance in which
// an unconditional jump returns us to a previous block,
// thus creating a pseudo-endpoint.
case INSTR_SHORTJMP:
offset = (long)(*(char *)(instr + 1));
address = instr + 2 + offset; // rel8 is relative to the next instruction
if(address <= curblock) return true;
break;
case INSTR_RELJMP:
offset = *(long *)(instr + 1);
address = instr + 5 + offset; // rel32 is relative to the next instruction
if(address <= curblock) return true;
break;
default:
return false;
break;
}

return false;
}


regards BanMe

aqrit
November 14th, 2008, 01:56
*edit added section relocation

Code:

#include <windows.h>

#define my_code ".myCode"
#define my_data ".myData"

#pragma section( my_data, read, write ) // must not have execute access
#pragma section( my_code, read, write, execute ) // must have execute

#pragma const_seg( my_data ) // constant data
#pragma data_seg( my_data ) // initialized data
#pragma bss_seg( my_data ) // uninitialized data
#pragma code_seg(my_code) // executable code
#pragma auto_inline( off ) // tell the compiler to not inline functions

int num1;
int __stdcall DoSomething(int num2)
{
int num3;
num3 = num1 + num2;
return num3;
}

#pragma auto_inline( on ) // restore
#pragma bss_seg() // restore to ".bss"
#pragma const_seg() // restore to ".rdata"
#pragma data_seg() // restore to ".data"
#pragma code_seg() // restore to ".text"

// Combine our two sections
// section name must be 8 bytes or less (including null terminator if you want one)
// LNK4254 warning expected: merged sections have different attributes
#pragma comment(linker, "/merge:.myData=.myCode") // append .myData to .myCode
#pragma comment(linker, "/merge:.myCode=.Inject") // rename the merged section
#pragma comment(linker, "/section:.Inject,RWE") // read, write, and execute access

PIMAGE_SECTION_HEADER GetSectionHeader( char * szSectionName, HMODULE hModule )
{
PIMAGE_DOS_HEADER dos = (PIMAGE_DOS_HEADER) hModule;
if(dos->e_magic != IMAGE_DOS_SIGNATURE)return NULL;

PIMAGE_NT_HEADERS pe = (PIMAGE_NT_HEADERS)( (ULONG)hModule + dos->e_lfanew );
if(pe->Signature != IMAGE_NT_SIGNATURE)return NULL;
if(!pe->FileHeader.SizeOfOptionalHeader)return NULL;

PIMAGE_SECTION_HEADER section_header = IMAGE_FIRST_SECTION( pe );

for(int i = 0; i < pe->FileHeader.NumberOfSections; i++)
{
if(!strcmp((char *)section_header[i].Name,szSectionName))
{
return &section_header[i];
}
}
return NULL;
}

// use the relocation data in the PE header to patch our code so it will
// work when loaded at any memory address
// *section must have read/write access
// *relocations must not be stripped
bool FixUpSection(PIMAGE_SECTION_HEADER pSectionHeader, DWORD dwNewBase, HMODULE hTarget)
{
bool bRet = false;

PIMAGE_DOS_HEADER dos = (PIMAGE_DOS_HEADER) hTarget;
if(dos->e_magic != IMAGE_DOS_SIGNATURE)return false;

PIMAGE_NT_HEADERS pe = (PIMAGE_NT_HEADERS)( (ULONG)hTarget + dos->e_lfanew );
if(pe->Signature != IMAGE_NT_SIGNATURE)return false;
if(!pe->FileHeader.SizeOfOptionalHeader)return false;

PIMAGE_DATA_DIRECTORY relocdir = (PIMAGE_DATA_DIRECTORY)
( pe->OptionalHeader.DataDirectory + IMAGE_DIRECTORY_ENTRY_BASERELOC );
if(!relocdir->Size)return false; // if module has no relocations

PIMAGE_BASE_RELOCATION reloc = (PIMAGE_BASE_RELOCATION)
((DWORD)hTarget + relocdir->VirtualAddress);

// find the reloc chunk(s) that correspond to our section
while(reloc->VirtualAddress)
{
if(! (reloc->VirtualAddress >
pSectionHeader->VirtualAddress + pSectionHeader->SizeOfRawData))
{
if(! (reloc->VirtualAddress < pSectionHeader->VirtualAddress) )
{
/* perform fixup from relocation data */
DWORD numRelocEntries = ( reloc->SizeOfBlock -
sizeof(IMAGE_BASE_RELOCATION)) / sizeof(WORD);

WORD * pRelEntry = (WORD *)((DWORD)reloc + sizeof(IMAGE_BASE_RELOCATION));

DWORD delta = dwNewBase - ((DWORD)pe->OptionalHeader.ImageBase +
(DWORD)pSectionHeader->VirtualAddress);

for(unsigned int i = 0; i < numRelocEntries; i++ )
{
if( ! (( pRelEntry[i] >> 12 ) == IMAGE_REL_BASED_HIGHLOW ) )
continue;

// get location to patch
PULONG CodeLoc = (DWORD *)(pe->OptionalHeader.ImageBase + reloc->VirtualAddress
+ (pRelEntry[i] & 0x0FFF));

// patch location
*CodeLoc += delta;

bRet = true;
}
}
}
reloc = (PIMAGE_BASE_RELOCATION)((DWORD)reloc + reloc->SizeOfBlock);
}
return bRet;
}

int WINAPI WinMain( HINSTANCE hInst, HINSTANCE hPI, char * CmdLine, int nCmdShow )
{
DWORD dwNewBase; // location to inject to
HANDLE hProcess; // target process
DWORD dwPID = 0; // target process id

// todo: get pid
dwPID = GetCurrentProcessId();

if( !(hProcess = OpenProcess(PROCESS_VM_WRITE | PROCESS_QUERY_INFORMATION |
PROCESS_VM_OPERATION,FALSE, dwPID)))return 0;

PIMAGE_SECTION_HEADER psh = GetSectionHeader( ".Inject", hInst );
if(!psh){ CloseHandle(hProcess); return 0; } // section not found

dwNewBase = (DWORD)VirtualAllocEx(hProcess, NULL, psh->SizeOfRawData,
MEM_COMMIT, PAGE_EXECUTE_READWRITE);

FixUpSection(psh, dwNewBase, hInst);

WriteProcessMemory(hProcess,(LPVOID)dwNewBase,
&DoSomething, psh->SizeOfRawData,NULL);

CloseHandle(hProcess);
return 0;
}
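
The fixup loop above boils down to decoding 16-bit relocation entries (type in the top 4 bits, page offset in the low 12) and adding a delta at each HIGHLOW location. A portable sketch of that one step, operating on a plain byte buffer so it can be tried without a PE file; `ApplyRelocBlock` and its parameters are assumptions for illustration, and a little-endian host is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

const std::uint16_t kRelBasedHighLow = 3; // value of IMAGE_REL_BASED_HIGHLOW

// Applies the 16-bit entries of one relocation block to a copy of the page
// they refer to. Returns the number of 32-bit values that were patched.
std::size_t ApplyRelocBlock(std::uint8_t* page, std::size_t pageSize,
                            const std::uint16_t* entries, std::size_t count,
                            std::uint32_t delta)
{
    std::size_t patched = 0;
    for (std::size_t i = 0; i < count; ++i) {
        std::uint16_t type = static_cast<std::uint16_t>(entries[i] >> 12);
        std::size_t off = entries[i] & 0x0FFFu;
        if (type != kRelBasedHighLow) continue; // e.g. ABSOLUTE padding entries
        if (off + 4 > pageSize) continue;       // would run past the buffer
        std::uint32_t value;
        std::memcpy(&value, page + off, sizeof(value));
        value += delta;                         // rebase the absolute address
        std::memcpy(page + off, &value, sizeof(value));
        ++patched;
    }
    return patched;
}
```

Patching a local copy of the section and writing that copy to the target is one possible variation; it would leave the injecting process's own image untouched, whereas the code above patches the section in place before copying it.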