View Full Version : Watermarking by linking order


niaren
December 8th, 2010, 15:41
Inspired from this thread
http://www.woodmann.com/forum/showthread.php?13913-Watermarking-application&p=88531#post88531

and in particular from the contents of this post

Quote:
...Others can correct me if I am wrong here but I believe what IDA does on top of what others have said is change the linker order of its various object files during the linking stage.

For example if the compile process ended up with the following objects

file1.o, file2.o, file3.o

You could change the order they are linked together, giving an individualised watermark. Now imagine doing that with the hundreds of object files that IDA most likely has: you would have loads of combinations you can use.

And personally I don't think it's an easy task to remove since you would need to move the order of the linked in objects to alter the watermark which means relative addresses within the program would need to be updated.


a mini project is proposed to study how to reverse/defeat/handle this (clever) way of creating a watermark. As mentioned in the above post, it may not be easy to reorder the objects/functions in the executable (.exe/.dll) because addresses then point to the wrong locations. It turns out that IDA and its scripting functionality (IDC) may be used to achieve the reordering without having to make it a BIG project. This is a mini-project.
With IDA and IDC the reordering can be automated, which is quite convenient because for applications with many object files it may not be safe to reorder just a subset of the object files. It would be safer to create a whole new watermark/permutation of all object files.
This mini-project is just as much a project about getting hands-on experience with IDC and having fun.
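
To put a rough number on the "loads of combinations" in the quote: n object files can be linked in n! different orders, so the order alone can carry about log2(n!) bits. The following Python sketch is only an illustration of that arithmetic. The helper name `order_for_id` and the idea of numbering the orders in lexicographic sequence are assumptions for the example, not anything known about how IDA or any vendor actually assigns watermarks.

```python
import math

def order_for_id(objects, ident):
    """Return the ident-th link order (lexicographic) of the given objects.

    Hypothetical helper: ident is decoded in the factorial number system,
    so every ident in range(n!) maps to a distinct order.
    """
    pool = list(objects)
    if not 0 <= ident < math.factorial(len(pool)):
        raise ValueError("ident out of range for this many objects")
    order = []
    for remaining in range(len(pool), 0, -1):
        block = math.factorial(remaining - 1)
        index, ident = divmod(ident, block)
        order.append(pool.pop(index))
    return order

objs = ["file1.o", "file2.o", "file3.o"]
print(order_for_id(objs, 0))  # ['file1.o', 'file2.o', 'file3.o']
print(order_for_id(objs, 5))  # ['file3.o', 'file2.o', 'file1.o']

# Bits of information available from the order of 100 object files.
print(math.log2(math.factorial(100)))
```

With 100 object files this comes to roughly 525 bits, far more than is needed to give every customer a unique order.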

In order to get started I have created a toy-application. All the application does is to print two strings.

Code:

main.c

extern void func1object1();
extern void func1object2();

void main()
{
    func1object1();
    func1object2();
}

file1.c

#include <stdio.h>

void func1object1()
{
    printf("Hello from object 1!\n");
}

file2.c

#include <stdio.h>

void func1object2()
{
    printf("Hello from object 2!\n");
}



From these 3 very simple files two applications are built, the only difference being that the linking order of the object files is different. This makefile

Code:

SRCS = main.c file1.c file2.c

OBJS1 = main.obj file1.obj file2.obj
OBJS2 = file2.obj file1.obj main.obj

CC = CL
CCFLAGS = /O2 /Oi /D "_MBCS" /FD /EHsc /MD /Gy /W3 /c /Zi /TC


LINK = link
LINKFLAGS1 = "/OUT:watermark1.exe" "/MANIFESTUAC:level='asInvoker' uiAccess='false'" /OPT:REF /OPT:ICF /DYNAMICBASE /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
LINKFLAGS2 = "/OUT:watermark2.exe" "/MANIFESTUAC:level='asInvoker' uiAccess='false'" /OPT:REF /OPT:ICF /DYNAMICBASE /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib

EC = echo
RM = del

default: all


clean:
	@$(RM) /F *.obj
	@$(RM) /F *.idb
	@$(RM) /F *.pdb
	@$(RM) /F *.exe
	@$(RM) /F *manifest*

%.obj : %.c
	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
	@$(EC) ************************************************
	@$(EC) * Compiling $@
	$(CC) $(CCFLAGS) $<

watermark1.exe: $(OBJS1)
	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
	$(LINK) $(LINKFLAGS1) $(OBJS1)
	$(LINK) $(LINKFLAGS2) $(OBJS2)

all: watermark1.exe


creates the two .exe files watermark1.exe and watermark2.exe. A zip file containing all the files is attached.
Maybe not surprisingly, for this example, the order of the objects in the binary corresponds to the order in which they are listed in the linker command. The idea is to create watermark2.exe from watermark1.exe.
I hope this example is not too simple. Maybe it will be much harder with C++ code, I have no idea. I'm not sure if it is possible to identify the objects themselves, but the functions can be identified (by IDA), and IDC (as far as I understand now) provides functionality for jumping to a specified function or just to the next function in the code given some virtual address.
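
Because the order in the binary follows the order on the linker command line, anyone holding the reference order could in principle read the permutation back from a copy by sorting known functions by address. A minimal Python sketch of that readout follows; `permutation_index` is a hypothetical helper, and the address table is made up for the illustration (it simply mirrors a fully reversed link order).

```python
import math

def permutation_index(reference, observed):
    """Rank of `observed` among the lexicographic permutations of `reference`."""
    pool = list(reference)
    rank = 0
    for item in observed:
        pos = pool.index(item)
        rank += pos * math.factorial(len(pool) - 1)
        pool.pop(pos)
    return rank

reference = ["main", "func1object1", "func1object2"]

# Hypothetical symbol -> address table recovered from some copy.
found = {"func1object2": 0x401000, "func1object1": 0x401010, "main": 0x401020}

observed = sorted(found, key=found.get)   # functions in address order
print(observed)
print(permutation_index(reference, observed))  # 5 (last of the 3! = 6 orders)
```

This is also why reordering only a few functions may not be enough: any part of the order left untouched still carries information.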

Does this make any sense at all?

dELTA
December 13th, 2010, 12:30
Nice introduction and starting documentation, I'm very much looking forward to seeing your progress in this project.

And yes, it makes sense indeed.

niaren
December 14th, 2010, 15:48
Thanks for the encouragement

Just came back to this mini-project after I let myself be interrupted by a crackme (my first .NET reversing) and that crackme was driving me nuts. I had virtually the complete source code (dotfuscated) and I couldn't solve it anyway!? It was quite a frustrating struggle, as you can imagine.

Anyway, I have just written and run my first IDC script. The script is basically a copy of an example in this book http://www.idabook.com/ p. 268.

The script enumerates the functions identified by IDA. The script looks like this:

Code:

#include <idc.idc> // Mandatory include directive

static main()
{
    // Step one, enumerate/list functions
    GetFunctions();
}

static GetFunctions()
{
    auto addr, name;
    addr = 0;
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        Message("Function: %s at %x\n", name, addr);
    }
}


When run on watermark1.exe it produces the following output (before you read on, guess how many functions IDA finds):

Code:

Compiling file 'C:\rce\LinkOrder\linkorder.idc'...
Executing function 'main'...
Function: _main at 401000
Function: sub_401010 at 401010
Function: sub_401020 at 401020
Function: _pre_cpp_init at 40102d
Function: ___tmainCRTStartup at 401078
Function: $LN31 at 4011ee
Function: start at 4012cf
Function: ?__CxxUnhandledExceptionFilter@@YGJPAU_EXCEPTION_POINTERS@@@Z at 4012d9
Function: $LN5 at 40131b
Function: _amsg_exit at 40132a
Function: __onexit at 401330
Function: $LN8 at 4013cc
Function: _atexit at 4013d5
Function: sub_4013EC at 4013ec
Function: sub_401412 at 401412
Function: _XcptFilter at 401438
Function: __ValidateImageBase at 401440
Function: __FindPESection at 401480
Function: __IsNonwritableInCurrentImage at 4014d0
Function: _initterm at 40158e
Function: _initterm_e at 401594
Function: __SEH_prolog4 at 40159c
Function: __SEH_epilog4 at 4015e1
Function: __except_handler4 at 4015f5
Function: __setdefaultprecision at 40161a
Function: sub_401645 at 401645
Function: ___security_init_cookie at 401648
Function: ?terminate@@YAXXZ at 4016de
Function: _unlock at 4016e4
Function: __dllonexit at 4016ea
Function: _lock at 4016f0
Function: sub_4016F6 at 4016f6
Function: _except_handler4_common at 401706
Function: _invoke_watson at 40170c
Function: _controlfp_s at 401712
Function: ___report_gsfailure at 401718
Function: _crt_debugger_hook at 40181e


We wrote 3 simple functions but IDA identifies 37!
It is not clear, at least not to me at this point, whether these extra functions can be filtered out or ignored for the reordering. At this stage they are ignored. Another thing that is not considered yet is whether the data is part of the watermark. Right now only the functions are considered.

I'm going to read some more to find out which IDA functions can be used for reordering the functions, and which data structure supported by IDA can be used to store the functions in preparation for the actual reordering.

dELTA
December 15th, 2010, 13:35
The other functions that were detected are most likely just standard library functions of the compiler/linker. You can see that IDA even identified a majority of them from its standard signatures.

If I were you I'd ignore those in the first stage of this project (some of them could have some quite annoying optimizations that will make trouble at the beginning of a project like this), and first only focus on your own functions (they will most likely be adjacent in the binary, and thus possible to rearrange independently of the library functions).
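
A crude way to follow this advice is to keep only the functions that come before the first recognised runtime function. The Python sketch below models that filter on a shortened copy of the listing from the earlier post. Using `_pre_cpp_init` as the boundary is an assumption that happens to fit this particular build; other compilers, CRT versions or link orders may place library code elsewhere.

```python
def user_functions(functions, stop_name="_pre_cpp_init"):
    """Return the leading (name, address) pairs up to, not including, stop_name."""
    kept = []
    for name, addr in functions:
        if name == stop_name:
            break
        kept.append((name, addr))
    return kept

# First few entries of the IDC output posted above.
listing = [
    ("_main", 0x401000),
    ("sub_401010", 0x401010),
    ("sub_401020", 0x401020),
    ("_pre_cpp_init", 0x40102D),
    ("___tmainCRTStartup", 0x401078),
]

for name, addr in user_functions(listing):
    print(f"{name} at {addr:x}")
```

If the boundary name never appears, the sketch keeps every function, so the result should be checked by eye before anything is moved.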

Btw, at a later stage you should probably take a look at import table reordering too, since this is a very simple and efficient way to watermark an exe file.

Kayaker
December 15th, 2010, 16:56
Boy, doesn't that illustrate the simple beauty of a program coded in ASM?

I created MAP files of both exe's and compared them with UltraEdit/Text Compare. The only differences recorded were the following:

Code:

watermark1:

0001:00000000 _main
0001:00000020 sub_401020
0002:000000E0 aHelloFromObject2


watermark2:

0001:00000000 sub_401000
0001:00000020 _main
0002:000000E0 aHelloFromObject1



In this "simple" case, we only have to worry about 3 procs, 401000, 401010 and 401020. The middle proc doesn't change, but if we were to swap the 1st and 3rd it could affect its alignment. In this particular case the number of bytes in the 1st and 3rd proc are the same so we can ignore the middle one, but even this shows how difficult fixing this up would be.

I'm just thinking out loud here.. Let's say one devises a script to swap procs 1 and 3 (having determined that that's the strategy needed) and also fixes up the jump/call relative addresses. But add a small layer of complexity, i.e. say the next time procs 1 and 3 are of *different* byte lengths.. that means we also have to deal with moving/fixing proc 2 as well.

Add a few more 'watermark' functions, different sizes, scattered all over a large amount of code, and now it just gets nasty to contemplate.

I'm curious now how an IDC script to fix the simplest scenarios might fare with a more complex one.

Simple:
swap 2 identified procs of the same size - no functions in between are affected
fix up relative jump/call addresses
done?

Not as simple:
swap 2 identified procs of *different* size - all functions in between are affected
fix up relative jump/call addresses of *all* affected code
done?

Crazy:
swap around many procs of varying sizes, fixing up all affected code
?improbable?
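
The difference between the "simple" and "not as simple" cases can be seen by recomputing the layout. The Python sketch below assumes functions are packed back to back from a base address and padded to 16-byte boundaries (suggested by the 0x10 spacing of the three procs). `relayout` is a hypothetical helper and all sizes are assumed for the illustration.

```python
def relayout(functions, new_order, base, align=16):
    """Map each function name to (old_address, new_address) for a new order.

    functions: name -> (old_address, size_in_bytes).
    """
    mapping = {}
    cursor = base
    for name in new_order:
        old_addr, size = functions[name]
        mapping[name] = (old_addr, cursor)
        cursor += size
        cursor = (cursor + align - 1) // align * align  # pad to next boundary
    return mapping

funcs = {
    "_main": (0x401000, 13),
    "sub_401010": (0x401010, 13),
    "sub_401020": (0x401020, 13),
}
swapped = ["sub_401020", "sub_401010", "_main"]

# Same sizes: the middle proc keeps its address.
same = relayout(funcs, swapped, 0x401000)
print({name: hex(new) for name, (old, new) in same.items()})

# Hypothetical 20-byte third proc: everything after it is pushed along.
funcs["sub_401020"] = (0x401020, 20)
bigger = relayout(funcs, swapped, 0x401000)
print({name: hex(new) for name, (old, new) in bigger.items()})
```

In the second case the middle proc moves from 0x401010 to 0x401020, so its callers need fixing even though it was not one of the procs being swapped.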

I suppose the other thing too is, understanding how the watermarks are checked. CRC check of only specific watermark functions? Maybe not all the code needs to be handled. Mightn't the watermark-check-code be the weak link in all this if the goal is to "crack" such a protection?


Kayaker

dELTA
December 15th, 2010, 21:18
Glad to have you in the discussion Kayaker.


Quote:
[Originally Posted by Kayaker;88591]In this "simple" case, we only have to worry about 3 procs, 401000, 401010 and 401020. The middle proc doesn't change, but if we were to swap the 1st and 3rd it could affect its alignment. In this particular case the number of bytes in the 1st and 3rd proc are the same so we can ignore the middle one, but even this shows how difficult fixing this up would be.
Yes, my viewpoint from the start has been that you must be prepared to move around all functions in the executable for a procedure like this, exactly because of such alignment problems combined with the fact that very few functions will be of the exact same size, and thus not "switchable in-place".


Quote:
[Originally Posted by Kayaker;88591]Add a few more 'watermark' functions, different sizes, scattered all over a large amount of code, and now it just gets nasty to contemplate.
As long as you have generic code to relocate a function to any position, why would it really be so much worse to move them all around than to move just a few? I'm sure the computer won't complain too much about one for loop being iterated a few more times? The only possible problem I can think of that increases with the number of simultaneously relocated functions is that there might be functions that are "harder to relocate" (due to crazy compiler optimizations or dynamic address resolutions of different kinds, that IDA therefore won't catch when analyzing/decompiling it). Other than that, am I missing something?


Quote:
[Originally Posted by Kayaker;88591]I'm curious now how an IDC script to fix the simplest scenarios might fare with a more complex one.

Simple:
swap 2 identified procs of the same size - no functions in between are affected
fix up relative jump/call addresses
done?

Not as simple:
swap 2 identified procs of *different* size - all functions in between are affected
fix up relative jump/call addresses of *all* affected code
done?

Crazy:
swap around many procs of varying sizes, fixing up all affected code
?improbable?
Again, as long as the "simple script" doesn't have hardcoded addresses for some special program or something stupid like that, and with my special reservations above, I can't really see the problem, either coding-complexity wise or execution time-complexity wise? Please, tell me what I'm missing, oh great god of the kayak!

Quote:
[Originally Posted by Kayaker;88591]I suppose the other thing too is, understanding how the watermarks are checked. CRC check of only specific watermark functions? Maybe not all the code needs to be handled. Mightn't the watermark-check-code be the weak link in all this if the goal is to "crack" such a protection?
First of all, there is one VERY big and important difference between CRC checks and watermarks, which is also exactly what makes watermarks such a pain in the ass. CRC checks are performed by the application itself, and can therefore, just as you say, be easily found, reversed and/or neutralized. The problem with watermarks is that the checking code is contained in a completely separate program, locked into a safe (or ok, most likely in a crappy unpatched Windows server, but anyway) inside the premises of the software author, only to be taken out and used locally at their office when the same software author finds a leaked/warezed version of their software on the net, in order to be able to subsequently sue the crap out of the person that the watermark reveals to be the source of the leak. Thus, no checking code is available for our analysis (unless you offer to burglarize the IDA Pro offices and steal it of course, which I'm sure would make you quite popular around lots of people here), and thus, each and every bit of information inside the executable could potentially be part of a secret watermark, cleverly steganographed into functionally important parts of the application. So, contrary to the common solution for removing a CRC check in a program (patching the check, or in more rare cases reversing the CRC algo and adapting the patch data to result in the same checksum), the only way to "remove" watermarks is to mess up the binary file in each and every way and dimension that you think information might be implicitly stored to form part of the watermark, while still keeping it fully functional, and that's why we're here today!

As mentioned in the thread referenced at the top of this thread, there are apparently rumours saying that e.g. IDA Pro uses the linking order of its object files to create one (out of many?) such watermark entropy pieces for IDA Pro copies, and thus the idea of this mini project was born, with its primary scope of investigating how easy it would be to re-shuffle all the functions in an arbitrary executable, in order to create a generic "crack" for exactly that specific type of watermarking technology.

Future (and probably well-needed in order to reach practical result) steps in the "creation of the ultimate generic watermark defeater tool" would probably be a similar (but comparatively more simple) import table shuffler, export table shuffler, relocation table shuffler, PE resource shuffler, and code-location-independent function and data area diffing tool, which checks for any differences within functions that are not related to their location (and thus neglecting different call and jump addresses inside their code related to that), e.g. to see if there are any differences in used instructions in sub areas of functions, differences in data ordering, or tracking data in PE headers or code caves.
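
As a toy illustration of the "code-location-independent" comparison described above, the Python sketch below blanks the displacement of every `E8` relative call before comparing two copies of a function. It is deliberately minimal: the decoder only knows the handful of opcodes that occur in this thread's example, and the bytes for the moved copy of main are hypothetical (computed by hand for a main placed at 401020 that calls 401010 and 401000). A real tool would need a full instruction decoder, or IDA's own analysis.

```python
def normalise(code):
    """Blank the rel32 operand of each E8 call so the bytes compare
    equal wherever the function sits.

    Toy decoder: it handles only the few opcodes used in this example.
    """
    out = bytearray(code)
    i = 0
    while i < len(out):
        op = out[i]
        if op == 0xE8:                              # call rel32
            out[i + 1:i + 5] = bytes(4)
            i += 5
        elif op == 0x68:                            # push imm32
            i += 5
        elif bytes(out[i:i + 2]) == b"\xff\x15":    # call dword ptr [abs32]
            i += 6
        elif bytes(out[i:i + 2]) == b"\x33\xc0":    # xor eax, eax
            i += 2
        elif op in (0x59, 0xC3, 0xCC):              # pop ecx, retn, int3 padding
            i += 1
        else:
            raise ValueError(f"opcode {op:#x} not handled by this toy decoder")
    return bytes(out)

main_at_401000 = bytes.fromhex("E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3")
# Hypothetical: the same main moved to 401020 with its two calls re-aimed.
main_at_401020 = bytes.fromhex("E8 EB FF FF FF E8 D6 FF FF FF 33 C0 C3")

print(main_at_401000 == main_at_401020)                        # False
print(normalise(main_at_401000) == normalise(main_at_401020))  # True
```

Blanking the displacement also discards which function is being called, so a more careful version would replace each target with a stable identifier (for example the callee's name) instead of zeros.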

This mini project is both a great first step and a very good mini project though! Well, until you answer my questions above and tell me it's impossible, but anyway.

Kayaker
December 15th, 2010, 22:27
Thanks for clarifying watermarking dELTA. I understood that it was to match a particular compilation to a particular person (so they might get the crap sued out of them as you say), but I was also envisioning it as being used as part of a "normal" protection scheme as well, which I guess doesn't necessarily have to be the case and obviously not part of this project.

i.e. as a particular key file will only work with a particular compilation because the linking order is taken into account. In other words, the linking order fingerprint is embedded in the key file and some algorithm is used with it to verify the integrity of the program. (the CRC check comment was a simplistic example of that idea)

If that's not the case, then what's the benefit of removing such a watermark? If I've got IDA and I'm able to steal YOUR IDA, then I can swap watermarks and YOU get blamed for the release, is that it?


Quote:

This mini project is both a great first step and a very good mini project though! Well, until you answer my questions above and tell me it's impossible, but anyway.


No, actually I do have hope, that's why I said "I'm curious now how an IDC script to fix the simplest scenarios might fare with a more complex one."
If you can reorder one function, in theory you should be able to reorder them all. In theory. That's the caveat that still needs to be addressed.


This reminded me of a paper I had posted before
http://www.woodmann.com/forum/showthread.php?9483-Article-Software-Security-Through-Targetted-Diversification

Software Security Through Targetted Diversification
http://www.cosic.esat.kuleuven.be/publications/thesis-122.pdf

The paper is a thesis which discusses the idea of creating software which is distributed as polymorphised versions, in an effort to discourage automated or generic cracking of it. Specifically it suggests the use of Genetic Algorithm (GA) programming to create a diverse population of software for distribution to the masses.


This suggests GA could also be used to create individualised programs. Change a few parameters, fitness/crossover values, record some unique aspect of the offspring (compiled program), and give it to its adopted parent (registered owner). If you find it outside of its new home (leaked), do a DNA analysis.

niaren
December 16th, 2010, 15:01
Quote:

The other functions that were detected are most likely just standard library functions of the compiler/linker. You can see that IDA even identified a majority of them from its standard signatures.

I was thinking the same thing: that they are appended and as such appear last in the image, but this is just an assumption for now.

Kayaker, did you create those MAP files in IDA? (File->Produce File->Create MAP file...) I didn't think of creating MAP files, maybe because the files are so simple. Thanks for the tip

About the length of the functions: my assumption is that we are dealing with one continuous block of functions and alignment data, and that in general all functions are moved. In this case the length of the functions does not matter when we do the reordering, I think. If the watermark is scattered over several distinct areas with stuff in between that is not part of the watermark, then this complicates things because the length of the functions matters. The idea when starting the mini-project was to make things as simple as possible to begin with and understand how to deal with this. Then we can make things more complicated along the way. For now the simple scenario is challenging enough for me.

Personally, I think this linking-order approach to watermarking is quite clever, mostly because I believe it is very practical (low-cost). There is no need for extra tools or for writing any assembler. All that is needed is a couple of additional lines in an already existing build system. So basically there is no extra work to be done by the software writer if the build system and version control system are already set up. And I also think the watermark is not so easy to remove, but that is what we are hoping to find out.

The IDC script has been expanded a little so that it actually takes care of reordering the functions.

Current IDC script
Code:
#include <idc.idc> // Mandatory include directive

static EnumerateAndStoreFunctions(hfunctionnames)
{
    auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;
    addr = 0;
    fidx = 0; // function index
    widx = 0; // word idx
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);

        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if(name == "_pre_cpp_init")
        {
            return fidx;
        }

        bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);
        if(bsuccess == 0)
        {
            Message("Saving name of function %s failed.\n", name);
        }

        // One array per function, named after the function, holds its dwords
        tmphandle = CreateArray(name);
        if(tmphandle == -1)
        {
            tmphandle = GetArrayId(name);
        }

        inextfunction = NextFunction(addr);
        if(inextfunction == BADADDR)
        {
            inextfunction = GetFunctionAttr(addr, FUNCATTR_END);
        }

        // Copy everything up to the start of the next function, one dword
        // at a time (the last dword may reach into the next function)
        widx = 0;
        for(tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 4)
        {
            SetArrayLong(tmphandle, widx, Dword(tmpaddr));
            widx = widx + 1;
        }
        bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);
        fidx = fidx + 1;
    }
    return fidx;
}

static PrintFunctions(hfunctionnames, inumberoffunctions)
{
    auto fidx;
    for(fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)
    {
        Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));
    }
}

static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr)
{
    auto fidx, oidx, funcname, hopcodes, opcodeslen;

    // Walk the stored functions from last to first
    for(fidx = inumberoffunctions - 1; fidx >= 0; fidx = fidx - 1)
    {
        funcname = GetArrayElement(AR_STR, hfunctionnames, 2*fidx);
        opcodeslen = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1);
        hopcodes = GetArrayId(funcname);
        for(oidx = 0; oidx < opcodeslen; oidx = oidx + 1)
        {
            PatchDword(iwriteaddr, GetArrayElement(AR_LONG, hopcodes, oidx));
            iwriteaddr = iwriteaddr + 4;
        }
    }
}

static main()
{
    auto inumberoffunctions, hfunctionnames;

    // This array is populated with names of functions and
    // the length of the functions in dwords in the following
    // way [name1,length1,name2,length2,...]
    hfunctionnames = CreateArray("FunctionNames");

    if(hfunctionnames == -1)
    {
        // If the array already exists get the handle by GetArrayId
        Message("hfunctionnames is -1.\n");
        hfunctionnames = GetArrayId("FunctionNames");
    }

    // Enumerate functions and store them in persistent arrays
    inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);

    // Print functions in IDA's output window
    PrintFunctions(hfunctionnames, inumberoffunctions);

    // Write back functions in reversed order
    WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000);
}


Watermark1.exe original

Code:
.text:00401000 ; =============== S U B R O U T I N E =======================================
.text:00401000
.text:00401000
.text:00401000 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:00401000 _main proc near ; CODE XREF: ___tmainCRTStartup+10A↓p
.text:00401000 call sub_401010
.text:00401005 call sub_401020
.text:0040100A xor eax, eax
.text:0040100C retn
.text:0040100C _main endp
.text:0040100C
.text:0040100C ; ---------------------------------------------------------------------------
.text:0040100D align 10h
.text:00401010
.text:00401010 ; =============== S U B R O U T I N E =======================================
.text:00401010
.text:00401010
.text:00401010 sub_401010 proc near ; CODE XREF: _main↑p
.text:00401010 push offset Format ; "Hello from object 1!\n"
.text:00401015 call ds:printf
.text:0040101B pop ecx
.text:0040101C retn
.text:0040101C sub_401010 endp
.text:0040101C
.text:0040101C ; ---------------------------------------------------------------------------
.text:0040101D align 10h
.text:00401020
.text:00401020 ; =============== S U B R O U T I N E =======================================
.text:00401020
.text:00401020
.text:00401020 sub_401020 proc near ; CODE XREF: _main+5↑p
.text:00401020 push offset aHelloFromObj_0 ; "Hello from object 2!\n"
.text:00401025 call ds:printf
.text:0040102B pop ecx
.text:0040102C retn
.text:0040102C sub_401020 endp


Watermark1.exe modified with script

Code:
.text:00401000 ; =============== S U B R O U T I N E =======================================
.text:00401000
.text:00401000
.text:00401000 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:00401000 _main proc near ; CODE XREF: ___tmainCRTStartup+10A↓p
.text:00401000 push 4020E0h
.text:00401005 call ds:printf
.text:0040100A add [ecx-3Dh], bl
.text:0040100C retn
.text:0040100C _main endp
.text:0040100C
.text:0040100C ; ---------------------------------------------------------------------------
.text:0040100D align 10h
.text:00401010
.text:00401010 ; =============== S U B R O U T I N E =======================================
.text:00401010
.text:00401010
.text:00401010 sub_401010 proc near ; CODE XREF: _main↑p
.text:00401010 push offset Format ; "Hello from object 1!\n"
.text:00401015 call ds:printf
.text:0040101B pop ecx
.text:0040101C retn
.text:0040101C sub_401010 endp
.text:0040101C
.text:0040101C ; ---------------------------------------------------------------------------
.text:0040101D align 10h
.text:00401020
.text:00401020 ; =============== S U B R O U T I N E =======================================
.text:00401020
.text:00401020
.text:00401020 sub_401020 proc near ; CODE XREF: _main+5↑p
.text:00401020 call near ptr unk_4020A8-1078h ; "Hello from object 2!\n"
.text:00401025 call near ptr loc_40103C+4
.text:0040102B rol bl, 0CCh
.text:0040102C retn
.text:0040102C sub_401020 endp


I have double-checked things in Hex-view
Code:
Before (start 0x401000)
E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3 CC CC CC
68 C8 20 40 00 FF 15 A0 20 40 00 59 C3 CC CC CC
68 E0 20 40 00 FF 15 A0 20 40 00 59 C3 68 12 14
After (start 0x401000)
68 E0 20 40 00 FF 15 A0 20 40 00 59 C3 68 12 14
68 C8 20 40 00 FF 15 A0 20 40 00 59 C3 CC CC CC
E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3 CC CC CC
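
A quick way to check the dump is to model the write-back outside IDA. The Python sketch below assumes only what the script does: each function was stored as whole dwords up to the start of the next function (16 bytes each here) and written back in reverse order.

```python
before = bytes.fromhex(
    "E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3 CC CC CC "
    "68 C8 20 40 00 FF 15 A0 20 40 00 59 C3 CC CC CC "
    "68 E0 20 40 00 FF 15 A0 20 40 00 59 C3 68 12 14 "
)
posted_after = bytes.fromhex(
    "68 E0 20 40 00 FF 15 A0 20 40 00 59 C3 68 12 14 "
    "68 C8 20 40 00 FF 15 A0 20 40 00 59 C3 CC CC CC "
    "E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3 CC CC CC "
)

# Split into the three 16-byte slots and write them back last to first.
rows = [before[i:i + 16] for i in range(0, len(before), 16)]
after = b"".join(reversed(rows))

print(after == posted_after)  # True
```

The check also makes the last three bytes of the third row stand out: `68 12 14` are not padding but appear to be the first bytes of `_pre_cpp_init` at 40102d, picked up because the copy loop rounds up to whole dwords. After the write-back they sit at 40100d, and the bytes at 40102d are now `CC CC CC`, so the start of `_pre_cpp_init` looks to have been overwritten.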


I don't understand the details of why for instance
Code:
xor eax, eax

in main becomes
Code:
rol bl, 0CCh

only that the addresses must be updated correspondingly in order to correct it. And this I think will be more difficult. I haven't yet thought about how to fix the addresses. One way maybe is to have a pre-processing stage where the addresses that need to be updated after the reordering are labeled and a post-processing stage after the actual reordering where the addresses are fixed...have to think some more about this
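
One likely explanation, consistent with the hex dump: IDA is still using the instruction boundaries of the old listing. In the old sub_401020 an instruction (`pop ecx`) started at 40102B, but in main's bytes the `xor eax, eax` starts one byte earlier, at 40102A. Decoding from the stale boundary therefore starts in the middle of an instruction. The Python sketch below only slices the posted bytes to show this; the mnemonics in the comments come from the standard x86 opcode map (`33 C0` is `xor eax, eax`, `C0 C3 ib` is `rol bl, imm8`).

```python
# main's 16-byte slot as it now sits at 0x401020 (third row of the "After" dump).
row = bytes.fromhex("E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3 CC CC CC")
base = 0x401020

def at(addr, count):
    """Bytes found at a virtual address inside the patched row."""
    start = addr - base
    return row[start:start + count]

print(at(0x40102A, 2).hex(" "))  # 33 c0     xor eax, eax  (true boundary)
print(at(0x40102B, 3).hex(" "))  # c0 c3 cc  rol bl, 0CCh  (stale boundary)
```

The same effect would explain `add [ecx-3Dh], bl` at 40100A in the modified main: the bytes there are `00 59 C3`, the tail of the `call` followed by `pop ecx` and `retn`.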

Kayaker
December 16th, 2010, 20:06

Hi niaren,

Nice start. If you edit the code it's always a good idea to get IDA to reanalyze. It will fix some things and point out errors in other sections. Try inserting something like the following at the end of main()

Code:
    auto text_start, text_end, size;

    text_start = SegByBase(1);
    text_end = SegEnd(text_start);
    size = text_end - text_start;

    Message("text_start %x \n", text_start);
    Message("text_end %x \n", text_end);
    Message("size %x \n", size);

    MakeUnknown (text_start, size, 1);
    AnalyzeArea (text_start, text_end+1);

or if you prefer to include the entire file it's even simpler to write just:

Code:
    MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);
    AnalyzeArea (MinEA(), MaxEA());

You can test this manually as well. After applying your existing script, undefine ('U') the affected sections and reanalyze with 'C'. You'll see that the rol bl, 0CCh is fixed back to xor eax, eax, but you'll also see that the middle proc was actually affected negatively, which you don't see if you don't reanalyze.

Cheers,

Kayaker


dELTA
December 16th, 2010, 22:06
Quote:
[Originally Posted by Kayaker;88594]Thanks for clarifying watermarking dELTA. I understood that it was to match a particular compilation to a particular person (so they might get the crap sued out of them as you say), but I was also envisioning it as being used as part of a "normal" protection scheme as well, which I guess doesn't necessarily have to be the case and obviously not part of this project.
Sure, it could of course be done, but it would be extremely stupid to reveal the watermark locations explicitly in the program's own code. A normal CRC will work just as well in that aspect, and be just as hard (easy) to patch out. You do of course understand this already, I'm just writing it here for reference.


Quote:
[Originally Posted by Kayaker;88594]If that's not the case, then what's the benefit of removing such a watermark? If I've got IDA and I'm able to steal YOUR IDA, then I can swap watermarks and YOU get blamed for the release, is that it?
The benefit is that everyone will have their own copy of IDA for every new release, when people aren't afraid of leaking a cracked version of their own copy anymore, including you, when you (or whatever friend you're leeching it off) get tired of paying the yearly fee.

Jokes aside (and before "someone" gets unnecessarily pissed on us), this thread and project is not about warezing IDA. Rather, it's about the theoretical challenge of defeating a more or less powerful "protection technique", for the pure hell (and learning experience) of it, just like all other discussions on this board. The IDA watermarks are one of the most highly held (and foremost, practically efficient!) protection systems out there today, so of course it's fun to try to break it!


Quote:
[Originally Posted by Kayaker;88594]This reminded me of a paper I had posted before
http://www.woodmann.com/forum/showthread.php?9483-Article-Software-Security-Through-Targetted-Diversification

...

The paper is a thesis which discusses the idea of creating software which is distributed as polymorphised versions, in an effort to discourage automated or generic cracking of it. Specifically it suggests the use of Genetic Algorithm (GA) programming to create a diverse population of software for distribution to the masses.


This suggests GA could also be used to create individualised programs. Change a few parameters, fitness/crossover values, record some unique aspect of the offspring (compiled program), and give it to its adopted parent (registered owner). If you find it outside of its new home (leaked), do a DNA analysis.
(rant start) Just for the record, I think the inclusion of "Genetic Algorithms" in that paper is just a stupid excuse to include some buzz words, and I don't at all see the practical use for it. The primary use of Genetic Algorithms is to find (semi)optimal solutions to massively multidimensional problems, while the efficient polymorphing of code in order to effectively hide information is absolutely not that kind of problem. All it will result in is less efficient and less systematic information hiding, and much more easily corruptible watermarks I think. It is very much like "artificial intelligence", which people also often try to use on completely incompatible and suboptimal problems, just because it has a "cool ring to it". (rant stop)


Quote:
[Originally Posted by niaren;88595]Kayaker, did you create those MAP files in IDA? (File->Produce File->Create MAP file...) I didn't think of creating MAP files, maybe because the files are so simple. Thanks for the tip
I suspect he simply let the linker produce them, which would be much more efficient for use as "reference material" in a case like this. You will find options for it in your linker.


Quote:
[Originally Posted by niaren;88595]I have double-checked things in Hex-view
Code:
Before (start 0x401000)
E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3 CC CC CC
68 C8 20 40 00 FF 15 A0 20 40 00 59 C3 CC CC CC
68 E0 20 40 00 FF 15 A0 20 40 00 59 C3 68 12 14
After (start 0x401000)
68 E0 20 40 00 FF 15 A0 20 40 00 59 C3 68 12 14
68 C8 20 40 00 FF 15 A0 20 40 00 59 C3 CC CC CC
E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3 CC CC CC
The best way to visualize your results would probably be to configure IDA to show the full opcode bytes directly in the disassembly listing. Then you would not need complementary hex dumps like this, and the somewhat confusing coincidences among the relative offsets and string pointers in your disassembly listings above would be much easier to explain.
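For readers following along without IDA, the quoted dump amounts to writing the three 16-byte blocks back in reverse order. Here is a minimal Python sketch of that step, run on the bytes from the dump. The helper name `reorder_blocks` and the (offset, size) block list are made up for illustration and are not part of niaren's script.

```python
def reorder_blocks(image, blocks, order):
    """Return a copy of `image` in which the (offset, size) blocks are
    written back to back, starting at the lowest block offset, in `order`."""
    out = bytearray(image)
    write = min(off for off, _ in blocks)
    for idx in order:
        off, size = blocks[idx]
        # Always copy from the untouched original so blocks never clobber each other
        out[write:write + size] = image[off:off + size]
        write += size
    return bytes(out)


# "Before" bytes from the dump (image offset 0 corresponds to 0x401000)
before = bytes.fromhex(
    "E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3 CC CC CC "
    "68 C8 20 40 00 FF 15 A0 20 40 00 59 C3 CC CC CC "
    "68 E0 20 40 00 FF 15 A0 20 40 00 59 C3 68 12 14 "
)

# Assumption: three blocks of 16 bytes each, reversed
blocks = [(0x00, 0x10), (0x10, 0x10), (0x20, 0x10)]
after = reorder_blocks(before, blocks, [2, 1, 0])
print(after.hex(" ").upper())
```

Note that the two E8 calls are copied unchanged here, which is exactly why the relative call targets need fixing up afterwards.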


Quote:
[Originally Posted by Kayaker;88597]Nice start. If you edit the code it's always a good idea to get IDA to reanalyze.

...

You can test this manually as well. After applying your existing script, undefine('U') the affected sections and reanalyze with 'C'. You'll see that the rol bl, 0CCh is fixed back to xor eax, eax, but you'll also see that the middle proc was actually affected negatively, which you don't see if you don't reanalyze.
When it comes to massive code permutations like this, I would never trust the results of a mere reanalysis of the live listing inside IDA. Rather, I would let the IDC script patch the raw mutated bytes right into a copy of the executable on disk, and load that one up in IDA individually. Otherwise, my guess is that you'll sooner or later be in a world of unnecessary pain and confusion.


Quote:
[Originally Posted by niaren;88595]...only that the addresses must be updated correspondingly in order to correct it. And this I think will be more difficult. I haven't yet thought about how to fix the addresses. One way maybe is to have a pre-processing stage where the addresses that need to be updated after the reordering are labeled and a post-processing stage after the actual reordering where the addresses are fixed
Yes, you should definitely identify and keep track of all offsets and addresses in the code before starting to shuffle it around, and then adjust all of these accordingly after having chosen a new location for the function in question. I strongly advise you to make use of IDA's powerful analysis and metadata information of the code for this purpose, since it has already done most of the hard work for you in this regard, i.e. identifying all offsets, addresses and other constructs relevant for such an operation!

Finally, very nice start niaren, keep up the good work, it will be very interesting to follow this!

niaren
December 19th, 2010, 17:10

Thanks for all the feedback. It's a real pleasure 



Kayaker, I have tried to insert 



MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);

AnalyzeArea (MinEA(), MaxEA());  



It does not really work, can't figure out why. I have to manually press 'U' 'C' as you said in order to get the disassembly to look right. This is of course unfortunate if we depend on IDA showing the correct disassembly. However, the approach used now in the script does not depend on IDA showing the correct disassembly, only initially.



And yes you were absolutely right, there was a bug in the script  

The reason why I asked about the MAP files is because I had not foreseen you would actually build the files yourself 



The script seems to work now, including patching the call instructions. It works in three steps:



- Create an address translation lookup table [I made up that name myself, don't know what else to call it]

- Patch in place the instructions that need to be updated

- Finally, do the reordering
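The call patching in the second step boils down to recomputing the rel32 operand of each 5-byte `E8` call from the new locations of both the call and its target. Below is a small Python sketch of that arithmetic, outside IDA. The function names are hypothetical, and treating targets that are missing from the table as unmoved is an assumption of this sketch, not something the IDC script does.

```python
import struct


def relocated_call_rel32(call_addr, target_addr, lut):
    """rel32 for an E8 call once both the call and its target have moved.

    `lut` maps old addresses to new ones. Targets absent from `lut` are
    assumed (for this sketch only) to stay where they are.
    """
    new_call = lut[call_addr]
    new_target = lut.get(target_addr, target_addr)
    # rel32 is relative to the instruction following the 5-byte call
    return (new_target - (new_call + 5)) & 0xFFFFFFFF


def encode_call(rel32):
    """Encode a near relative call: opcode E8 followed by little-endian rel32."""
    return b"\xE8" + struct.pack("<I", rel32)


# Old -> new addresses taken from the watermark1.exe table in this post
lut = {
    0x401000: 0x40101D,  # first call in main
    0x401005: 0x401022,  # second call in main
    0x401010: 0x40100D,  # sub_401010
    0x401020: 0x401000,  # sub_401020
}

first = relocated_call_rel32(0x401000, 0x401010, lut)
second = relocated_call_rel32(0x401005, 0x401020, lut)
print(encode_call(first).hex(" "), "|", encode_call(second).hex(" "))
```

Both displacements come out negative (backward calls), which fits main ending up last in the reordered image.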



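The address translation LUT takes an instruction address in the original image as input and returns that instruction's address in the reordered image (strictly speaking these are virtual addresses rather than RVAs, since they include the image base). For watermark1.exe the LUT looks like this: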



Code:
Address 401000 mapped to 40101d
Address 401005 mapped to 401022
Address 40100a mapped to 401027
Address 40100c mapped to 401029
Address 401010 mapped to 40100d
Address 401015 mapped to 401012
Address 40101b mapped to 401018
Address 40101c mapped to 401019
Address 401020 mapped to 401000
Address 401025 mapped to 401005
Address 40102b mapped to 40100b
Address 40102c mapped to 40100c



The PatchInPlaceDebug function prints the LUT.
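As a cross-check, the same table can be reproduced outside IDA. This is a rough Python sketch, assuming the function starts and instruction offsets shown in the listing and function sizes of 0x10, 0x10 and 0x0D. The last size is inferred from the mapping 401010 -> 40100d, and the helper names are invented for this example.

```python
def new_function_addresses(funcs, order):
    """funcs: list of (start, size); order: function indices in new layout order.
    Returns {old_start: new_start}, packing functions back to back."""
    write = funcs[0][0]  # assumes funcs[0] marks the start of the region
    moved = {}
    for idx in order:
        start, size = funcs[idx]
        moved[start] = write
        write += size
    return moved


def build_lut(funcs, insn_offsets, order):
    """Map every instruction address to its address in the reordered image."""
    moved = new_function_addresses(funcs, order)
    lut = {}
    for (start, _), offsets in zip(funcs, insn_offsets):
        for off in offsets:
            lut[start + off] = moved[start] + off
    return lut


# Assumed inputs for watermark1.exe (sizes include trailing padding)
funcs = [(0x401000, 0x10), (0x401010, 0x10), (0x401020, 0x0D)]
insn_offsets = [
    [0x0, 0x5, 0xA, 0xC],  # main: call, call, xor, retn
    [0x0, 0x5, 0xB, 0xC],  # sub_401010: push, call, pop, retn
    [0x0, 0x5, 0xB, 0xC],  # sub_401020: push, call, pop, retn
]

lut = build_lut(funcs, insn_offsets, [2, 1, 0])
for old, new in sorted(lut.items()):
    print("Address %x mapped to %x" % (old, new))
```

A dictionary keyed by address plays the role that the sparse IDC array plays in the script.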



This is the script



#include <idc.idc> // Mandatory include directive



static GetNumberOfFunctions()

{

    auto addr, name, fidx;

    addr = 0;

    fidx = 0; // function index

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return fidx;

        }

        fidx = fidx + 1;        

    }

    return fidx;

}



static CreatePermutation(inumberoffunctions)

{

    auto hpermutation;

    

    hpermutation = CreateArray("Permutation");

    if(hpermutation == -1)

    {

        // If array already exist get the handle by GetArrayId

        hpermutation = GetArrayId("Permutation");

    }

    // Hardcoded permutation

    SetArrayLong(hpermutation, 0, 2);

    SetArrayLong(hpermutation, 1, 1);

    SetArrayLong(hpermutation, 2, 0);

    return hpermutation;

}

    

static GetFunctionAddresses()

{

    auto addr, name, fidx, hfunctionaddresses;

    addr = 0;

    fidx = 0; // function index

    

    hfunctionaddresses = CreateArray("FunctionAddresses");

    if(hfunctionaddresses == -1)

    {

        // If array already exist get the handle by GetArrayId

        hfunctionaddresses = GetArrayId("FunctionAddresses");

    }



    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return hfunctionaddresses;

        }

        SetArrayLong(hfunctionaddresses, 2*fidx, addr);

        SetArrayLong(hfunctionaddresses, 2*fidx+1, NextFunction(addr) - addr);

        

        fidx = fidx + 1;

    }

    return hfunctionaddresses;

}



static GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions)

{

    auto addr, pidx, fidx, hnewfunctionaddresses;

    addr = 0;

    pidx = 0;

    fidx = 0;

    

    hnewfunctionaddresses = CreateArray("NewFunctionAddresses");

    if(hnewfunctionaddresses == -1)

    {

        // If array already exist get the handle by GetArrayId

        hnewfunctionaddresses = GetArrayId("NewFunctionAddresses");

    }

    

    // Address of first function

    addr = NextFunction(addr);

    

    fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 

    SetArrayLong(hnewfunctionaddresses, fidx, addr);

    addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

    

    for(pidx=1; pidx < inumberoffunctions; pidx++)

    {

        fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 

        SetArrayLong(hnewfunctionaddresses, fidx, addr);

        addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

    }

    return hnewfunctionaddresses;

}



static CreateAddressTranslationLUT(hnewfunctionaddresses)

{

    auto addr, haddresstranslationlut, name, end, inst, newaddr, fidx;

    addr = 0;

    fidx = 0;

    

    haddresstranslationlut = CreateArray("AddressTranslationLookupTable");

    if(haddresstranslationlut == -1)

    {

        // If array already exist get the handle by GetArrayId

        haddresstranslationlut = GetArrayId("AddressTranslationLookupTable");

    }

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return haddresstranslationlut;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        // Get new base address of function

        newaddr = GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx);

        

        SetArrayLong(haddresstranslationlut, inst, newaddr);

        Message("haddresstranslationlut %x \n",haddresstranslationlut);

        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        while(inst < end)

        {

            SetArrayLong(haddresstranslationlut, inst, newaddr + (inst-addr));

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

        fidx = fidx + 1;

    }

    return haddresstranslationlut;

}



static PatchInPlaceDebug(haddresstranslationlut)

{

    auto addr, name, end, inst, newaddr;

    addr = 0;

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        while(inst < end)

        {

            Message("Address %x mapped to %x\n",inst,GetArrayElement(AR_LONG, haddresstranslationlut, inst));

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

    }

}



static PatchInPlace(haddresstranslationlut)

{

    auto addr, name, end, inst, newaddr, opidx, optype, newrva, nearaddr;

    addr = 0;

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        while(inst < end)

        {

            opidx = 0;

            optype = GetOpType(inst,opidx);

            while(optype > 0)

            {

                if(optype == 7)

                {

                    // Immediate Near Address

                    

                    // Maybe not necessary but check for call instruction

                    if(GetMnem(inst) == "call")

                    {

                        nearaddr = LocByName(GetOpnd(inst, opidx));

                        // Check the target before patching anything

                        if(nearaddr == BADADDR)

                        {

                            Message("Fatal error, error processing instruction at %x\n", inst);

                        }

                        else

                        {

                            Message("Instruction at %x being patched.\n", inst);

                            // rel32 = new target - address following the relocated 5-byte call

                            newrva = GetArrayElement(AR_LONG, haddresstranslationlut, nearaddr) - (GetArrayElement(AR_LONG, haddresstranslationlut, inst)+0x5);

                            PatchDword(inst+0x1, newrva);

                        }

                    }

                    else

                    {

                        Message("Unsupported! Unknown %s instruction needs to be patched.\n", GetMnem(inst));

                    }

                        

                }

                

                opidx++;

                optype = GetOpType(inst,opidx);

            }  

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

    }

}



static EnumerateAndStoreFunctions(hfunctionnames)

{

    auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;

    addr = 0;

    fidx = 0; // function index

    widx = 0; // word idx

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return fidx;

        }

        

        bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);

        if(bsuccess == 0)

        {

            Message("Saving name of function %s failed.\n", name);

        }

    

        tmphandle = CreateArray(name);

        if(tmphandle == -1)

        {

            tmphandle = GetArrayId(name);

        }



        inextfunction = NextFunction(addr);

        if(inextfunction == BADADDR)

        {

            inextfunction = GetFunctionAttr(addr, FUNCATTR_END);

        }

        

        widx = 0;

        for(tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 1)

        {

             SetArrayLong(tmphandle, widx, Byte (tmpaddr));

             widx = widx + 1;

        }

        bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);

        fidx = fidx + 1;        

    }

    return fidx;

}



static PrintFunctions(hfunctionnames, inumberoffunctions)

{

    auto fidx;

    for(fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)

    {

        Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));

    }

}



static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr)

{

    auto fidx, oidx, funcname, hopcodes, opcodeslen;



    for(fidx = 2; fidx >= 0; fidx = fidx - 1)

    {

        funcname    = GetArrayElement(AR_STR, hfunctionnames, 2*fidx); 

        opcodeslen  = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1); 

        hopcodes    = GetArrayId(funcname);

        for(oidx = 0; oidx < opcodeslen; oidx = oidx + 1)

        {

            PatchByte(iwriteaddr, GetArrayElement(AR_LONG, hopcodes, oidx));

            iwriteaddr = iwriteaddr + 1;

        }

    }

}



static main()

{

    auto didx, inumberoffunctions, hfunctionnames, hpermutation, hfunctionaddresses, hnewfunctionaddresses;

    auto haddresstranslationlut;

    

    // Get number of functions 

    inumberoffunctions = GetNumberOfFunctions();

    

    // DEBUG

    // Message("Number of functions %d\n",inumberoffunctions);



    // Create permutation array

    hpermutation = CreatePermutation(inumberoffunctions);

    

    // Get current function addresses

    hfunctionaddresses = GetFunctionAddresses();

    

    // Get addresses after permutation

    hnewfunctionaddresses = GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions);    



    // Pre-processing, create address translation lookup table

    haddresstranslationlut = CreateAddressTranslationLUT(hnewfunctionaddresses);

    

    PatchInPlace(haddresstranslationlut);

    

    //DEBUG  

    //for(didx = 0; didx<inumberoffunctions; didx++)

    //{

    //    Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, didx));

    //}

    //return;

    

    // This array is populated with names of functions and

    // the length of the functions in bytes in the following

    // way [name1,length1,name2,length2,...]

    hfunctionnames = CreateArray("FunctionNames");

     

    if(hfunctionnames == -1)

    {

        // If array already exist get the handle by GetArrayId

        Message("hfunctionnames is -1.\n");

        hfunctionnames = GetArrayId("FunctionNames");

    }

 

    //  Enumerate functions and store them in a persistent array

    inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);    



    // Print functions in IDA's output window

    PrintFunctions(hfunctionnames, inumberoffunctions);

    

    // Write Back functions in reversed order

    WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000);

 

    MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);

    AnalyzeArea (MinEA(), MaxEA());      

}





Watermark1.exe before



Code:
.text:00401000 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:00401000 _main           proc near               ; CODE XREF: ___tmainCRTStartup+10Ap
.text:00401000                 call    sub_401010
.text:00401005                 call    sub_401020
.text:0040100A                 xor     eax, eax
.text:0040100C                 retn
.text:0040100C _main           endp
.text:0040100C
.text:0040100C ; ---------------------------------------------------------------------------
.text:0040100D                 align 10h
.text:00401010
.text:00401010 ; =============== S U B R O U T I N E =======================================
.text:00401010
.text:00401010
.text:00401010 sub_401010      proc near               ; CODE XREF: _mainp
.text:00401010                 push    offset Format   ; "Hello from object 1!\n"
.text:00401015                 call    ds:printf
.text:0040101B                 pop     ecx
.text:0040101C                 retn
.text:0040101C sub_401010      endp
.text:0040101C
.text:0040101C ; ---------------------------------------------------------------------------
.text:0040101D                 align 10h
.text:00401020
.text:00401020 ; =============== S U B R O U T I N E =======================================
.text:00401020
.text:00401020
.text:00401020 sub_401020      proc near               ; CODE XREF: _main+5p
.text:00401020                 push    offset aHelloFromObj_0 ; "Hello from object 2!\n"
.text:00401025                 call    ds:printf
.text:0040102B                 pop     ecx
.text:0040102C                 retn
.text:0040102C sub_401020      endp



watermark1.exe after (had to press 'U' 'C' on the last routine)



Code:
.text:00401000 _main           proc near               ; CODE XREF: .text:00401022p
.text:00401000                                         ; .text:00401182p
.text:00401000                 push    offset aHelloFromObjec ; "Hello from object 2!\n"
.text:00401005                 call    ds:printf
.text:0040100B                 pop     ecx
.text:0040100C                 retn
.text:0040100C _main           endp
.text:0040100C
.text:0040100D
.text:0040100D ; =============== S U B R O U T I N E =======================================
.text:0040100D
.text:0040100D
.text:0040100D sub_40100D      proc near               ; CODE XREF: .text:0040101Dp
.text:0040100D                 push    offset aHelloFromObj_0 ; "Hello from object 1!\n"
.text:00401012                 call    ds:printf
.text:00401018                 pop     ecx
.text:00401019                 retn
.text:00401019 sub_40100D      endp
.text:00401019
.text:00401019 ; ---------------------------------------------------------------------------
.text:0040101A                 db 3 dup(0CCh)
.text:0040101D ; ---------------------------------------------------------------------------
.text:0040101D                 call    sub_40100D
.text:00401022                 call    _main
.text:00401027                 xor     eax, eax
.text:00401029                 retn



The acid test must be to get a working exe. I was a little surprised to learn that File->Produce File->Create EXE file... shows an 'Unsupported' messagebox. The Entry Point also needs to be changed.



I have also tested the script on watermark2.exe and it also seems to work there after forcing IDA to show the correct disassembly with 'U' 'C'.



I will do some searching afterwards to see how I can get the changes made in IDA down to a file on disk as you mentioned dELTA.
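For the patch-to-disk step, the main piece of arithmetic is turning an address shown in IDA into an offset in the file. Here is a hedged Python sketch of that conversion for a 32-bit PE. The function name `va_to_file_offset` is made up, only the header fields named in the comments are read, and the header in the usage lines is a synthetic stand-in rather than data from watermark1.exe.

```python
import struct


def va_to_file_offset(pe, va):
    """Translate a virtual address to a raw file offset (PE32 assumed)."""
    e_lfanew, = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a valid PE file")
    num_sections, = struct.unpack_from("<H", pe, e_lfanew + 0x06)
    opt_size, = struct.unpack_from("<H", pe, e_lfanew + 0x14)
    # OptionalHeader.ImageBase sits at +0x1C in a PE32 optional header
    image_base, = struct.unpack_from("<I", pe, e_lfanew + 0x18 + 0x1C)
    section_table = e_lfanew + 0x18 + opt_size
    rva = va - image_base
    for i in range(num_sections):
        # VirtualSize, VirtualAddress, SizeOfRawData, PointerToRawData
        vsize, vaddr, rsize, rptr = struct.unpack_from(
            "<IIII", pe, section_table + i * 0x28 + 0x08)
        if vaddr <= rva < vaddr + max(vsize, rsize):
            return rva - vaddr + rptr
    raise ValueError("address is not inside any section")


# Synthetic header for illustration: one section at RVA 0x1000, raw data at 0x400
pe = bytearray(0x400)
struct.pack_into("<I", pe, 0x3C, 0x80)                    # e_lfanew
pe[0x80:0x84] = b"PE\0\0"                                 # signature
struct.pack_into("<H", pe, 0x80 + 0x06, 1)                # NumberOfSections
struct.pack_into("<H", pe, 0x80 + 0x14, 0xE0)             # SizeOfOptionalHeader
struct.pack_into("<I", pe, 0x80 + 0x18 + 0x1C, 0x400000)  # ImageBase
struct.pack_into("<IIII", pe, 0x80 + 0x18 + 0xE0 + 0x08,
                 0x1000, 0x1000, 0x200, 0x400)            # section header fields
pe = bytes(pe)
print(hex(va_to_file_offset(pe, 0x40101D)))
```

Reading SizeOfOptionalHeader instead of assuming a fixed section-table offset is a design choice of this sketch. It keeps the lookup working when the optional header is not the usual 0xE0 bytes.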

Another thing that I'm seriously considering is to switch to IDAPython. The IDC script is a little messy, there is little code reuse and I can hardly find my way through the code myself. I hope all this will change when going to Python. This mini-project was also about learning IDC, but I think there is enough IDC code now.

Are you ready for Python? 


Kayaker
December 19th, 2010, 20:33

No I just used the IDA feature to produce the MAP file.  Rumours to the contrary are unfounded 

I mean, we couldn't relink to create a map file in real life, so one could hardly "cheat" in this mini project right?





I see what you mean about AnalyzeArea not working very well.  I tried adding a second instance after the first with



...

    Wait();  // Wait for the end of autoanalysis

    AnalyzeArea (MinEA(), MaxEA());



The second pass produced further changes, but still didn't get everything correct.  As dELTA alluded to, I guess it's not perfect and a full redisassembly of a patched file would probably produce better results.





However, if you're interested in what else you can do to produce good (re)disassembly results using a script, you might want to look at the source of the very effective IDA_ExtraPass_PlugIn by Sirmabus.  It handles things like 'align' blocks, stray blocks of code, undefined functions and such.



http://www.woodmann.com/collaborative/tools/ExtraPass    





If interested, you might also like to look at the IDC scripts I wrote for analyzing a malware:



http://www.woodmann.com/forum/entry.php?35-IDC-scripting-a-Win32.Virut-variant-Part-1



I ended up doing several "clean-up" passes to make a readable disassembly.  They are in 4 separate idc scripts just for clarity. The first was a standard AnalyzeArea reanalysis after doing some decrypting. The next step was a manual fix-up of embedded string pointers (I couldn't think of a "smart" script to handle that automatically).



Then came a script to convert operands of the form "[ebp+xxxxxxh]" to a real offset, another one to clean up unwanted operand prefix/suffix text the disassembly produced, and one more to resolve API addresses.  Finally we read in a C header file containing some undefined function prototypes and structures.  If you read the full blog post, I mention a few more details about some useful idc commands and a few quirks I found while working with reanalysing a disassembly.



It just goes to show that there are a fair number of things you can do to produce a "nice" looking and accurate disassembly using IDC/plw scripts.





Python? Are you a masochist? 


niaren
December 20th, 2010, 17:43

Quote:
[Originally Posted by Kayaker;88637]Python? Are you a masochist?

Hehe, had a good laugh 



I will take a look at all the goodies you referenced in order to get the disassembly right. However, I can't wait to test the script on a real exe so I have prioritized that for today. And I think I'm almost there now. Made a quick test on watermark1.exe and it looked alright. Just need to change the entry point and do some testing 





#include <idc.idc> // Mandatory include directive



static GetFileHandle(mode)

{

    auto hFile;

    

    hFile = fopen(GetInputFilePath(), mode);

    if (0 == hFile)

    {

        Message("Cannot open \"" + GetInputFile() + "\"");

    }

    return hFile;

}



static GetPointerToPEHeader(hfile)

{

    auto e_lfanew;

    

    // Seek to the e_lfanew field 

    if (0 != fseek(hfile, 0x3C, 0))

    {

        Message(" 1 Cannot seek in \"" + GetInputFile() + "\", handle: %x", hfile);

    }



    // Read the value of e_lfanew

    e_lfanew = readlong(hfile, 0);



    // Seek to IMAGE_NT_HEADERS

    if (0 != fseek(hfile, e_lfanew, 0))

    {

        Message(" 2 Cannot seek in \"" + GetInputFile() + "\", handle: %x, elfanew: %x\n", hfile, e_lfanew);

    }



    // Read the Signature

    if (0x00004550 != readlong(hfile, 0))

    {

        Message("Not a valid PE file");

    }

    return e_lfanew;

}



static GetImageBase(hfile, e_lfanew)

{

    auto imageBase;

    

    // Seek to the IMAGE_NT_HEADERS.OptionalHeader.ImageBase field

    if (0 != fseek(hfile, e_lfanew + 0x18 + 0x1C, 0))

    {

        Fatal(" 3 Cannot seek in \"" + GetInputFile() + "\"");

    }

    imageBase = readlong(hfile, 0);

    return imageBase;

}



static GetVirtualSectionOffset(hfile, e_lfanew, section)

{

    auto numberOfSections, sectionRva;

    

    // Seek to the IMAGE_FILE_HEADER.NumberOfSections field

    if (0 != fseek(hfile, e_lfanew + 0x06, 0))

    {

        Fatal(" 4 Cannot seek in \"" + GetInputFile() + "\"");

    }



    // Read the number of sections

    numberOfSections = readshort(hfile, 0);

    

    if (section >= numberOfSections)

    {

        Fatal("Invalid section");

    }



    // Seek to the desired section

    if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x0C, 0))

    {

        Fatal(" 5 Cannot seek in \"" + GetInputFile() + "\"");

    }



    sectionRva = readlong(hfile, 0);

    return sectionRva;

}



static GetRawSectionOffset(hfile, e_lfanew, section)

{

    auto pointerToRawData;

    

    // Seek to the desired section

    if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x14, 0))

    {

        Fatal(" 6 Cannot seek in \"" + GetInputFile() + "\"");

    }



    pointerToRawData = readlong(hfile, 0);

    return pointerToRawData;

}



static GetFileOffset(rva, imagebase, virtualsectionoffset, rawsectionoffset)

{

    return rva - imagebase - virtualsectionoffset + rawsectionoffset;

}



static GetNumberOfFunctions()

{

    auto addr, name, fidx;

    addr = 0;

    fidx = 0; // function index

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return fidx;

        }

        fidx = fidx + 1;        

    }

    return fidx;

}



static CreatePermutation(inumberoffunctions)

{

    auto hpermutation;

    

    hpermutation = CreateArray("Permutation");

    if(hpermutation == -1)

    {

        // If array already exist get the handle by GetArrayId

        hpermutation = GetArrayId("Permutation");

    }

    // Hardcoded permutation

    SetArrayLong(hpermutation, 0, 2);

    SetArrayLong(hpermutation, 1, 1);

    SetArrayLong(hpermutation, 2, 0);

    return hpermutation;

}

    

static GetFunctionAddresses()

{

    auto addr, name, fidx, hfunctionaddresses;

    addr = 0;

    fidx = 0; // function index

    

    hfunctionaddresses = CreateArray("FunctionAddresses");

    if(hfunctionaddresses == -1)

    {

        // If array already exist get the handle by GetArrayId

        hfunctionaddresses = GetArrayId("FunctionAddresses");

    }



    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return hfunctionaddresses;

        }

        SetArrayLong(hfunctionaddresses, 2*fidx, addr);

        SetArrayLong(hfunctionaddresses, 2*fidx+1, NextFunction(addr) - addr);

        

        fidx = fidx + 1;

    }

    return hfunctionaddresses;

}



static GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions)

{

    auto addr, pidx, fidx, hnewfunctionaddresses;

    addr = 0;

    pidx = 0;

    fidx = 0;

    

    hnewfunctionaddresses = CreateArray("NewFunctionAddresses");

    if(hnewfunctionaddresses == -1)

    {

        // If array already exist get the handle by GetArrayId

        hnewfunctionaddresses = GetArrayId("NewFunctionAddresses");

    }

    

    // Address of first function

    addr = NextFunction(addr);

    

    fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 

    SetArrayLong(hnewfunctionaddresses, fidx, addr);

    addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

    

    for(pidx=1; pidx < inumberoffunctions; pidx++)

    {

        fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 

        SetArrayLong(hnewfunctionaddresses, fidx, addr);

        addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

    }

    return hnewfunctionaddresses;

}



static CreateAddressTranslationLUT(hnewfunctionaddresses)

{

    auto addr, haddresstranslationlut, name, end, inst, newaddr, fidx;

    addr = 0;

    fidx = 0;

    

    haddresstranslationlut = CreateArray("AddressTranslationLookupTable");

    if(haddresstranslationlut == -1)

    {

        // If array already exist get the handle by GetArrayId

        haddresstranslationlut = GetArrayId("AddressTranslationLookupTable");

    }

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return haddresstranslationlut;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        // Get new base address of function

        newaddr = GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx);

        

        SetArrayLong(haddresstranslationlut, inst, newaddr);

        Message("haddresstranslationlut %x \n",haddresstranslationlut);

        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        while(inst < end)

        {

            SetArrayLong(haddresstranslationlut, inst, newaddr + (inst-addr));

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

        fidx = fidx + 1;

    }

    return haddresstranslationlut;

}



static PatchInPlaceDebug(haddresstranslationlut)

{

    auto addr, name, end, inst, newaddr;

    addr = 0;

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        while(inst < end)

        {

            Message("Address %x mapped to %x\n",inst,GetArrayElement(AR_LONG, haddresstranslationlut, inst));

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

    }

}



static PatchInPlace(haddresstranslationlut)

{

    auto addr, name, end, inst, newaddr, opidx, optype, newrva, nearaddr;

    addr = 0;

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        while(inst < end)

        {

            opidx = 0;

            optype = GetOpType(inst,opidx);

            while(optype > 0)

            {

                if(optype == 7)

                {

                    // Immediate Near Address

                    

                    // Maybe not necessary but check for call instruction

                    if(GetMnem(inst) == "call")

                    {

                        Message("Instruction at %x being patched.\n", inst);

                        nearaddr = LocByName(GetOpnd(inst, opidx));

                        newrva   = GetArrayElement(AR_LONG, haddresstranslationlut, nearaddr) - (GetArrayElement(AR_LONG, haddresstranslationlut, inst)+0x6);

                        PatchDword(inst+0x1, newrva+0x1);

                        if(nearaddr == BADADDR)

                        {

                            Message("Fatal error, error processing instruction at %x\n", inst);

                        }

                    }

                    else

                    {

                        Message("Unsupported! Unknown %s instruction needs to be patched.\n", GetMnem(inst));

                    }

                        

                }

                

                opidx++;

                optype = GetOpType(inst,opidx);

            }  

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

    }

}



static EnumerateAndStoreFunctions(hfunctionnames)

{

    auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;

    addr = 0;

    fidx = 0; // function index

    widx = 0; // word idx

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return fidx;

        }

        

        bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);

        if(bsuccess == 0)

        {

            Message("Saving name of function %s failed.",name);

        }

    

        tmphandle = CreateArray(name);

        if(tmphandle == -1)

        {

            tmphandle = GetArrayId(name);

        }



        inextfunction = NextFunction(addr);

        if(inextfunction == BADADDR)

        {

            inextfunction = GetFunctionAttr(addr, FUNCATTR_END);

        }

        

        widx = 0;

        for(tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 1)

        {

             SetArrayLong(tmphandle, widx, Byte (tmpaddr));

             widx = widx + 1;

        }

        bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);

        fidx = fidx + 1;        

    }

    return fidx;

}



static PrintFunctions(hfunctionnames, inumberoffunctions)

{

    auto fidx;

    for(fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)

    {

        Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));

    }

}



static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr, writetofile, hfile)

{

    auto fidx, oidx, funcname, hopcodes, opcodeslen;

    auto imagebase, virtualsectionoffset, rawsectionoffset;

    auto writeerror, byte, hglobalvars, fileoffset;

    

    if(writetofile == 1)

    {

        hglobalvars          = GetArrayId("GlobalVars");

        imagebase            = GetArrayElement(AR_LONG, hglobalvars, 0);

        virtualsectionoffset = GetArrayElement(AR_LONG, hglobalvars, 1);

        rawsectionoffset     = GetArrayElement(AR_LONG, hglobalvars, 2);

    }

    

    // DEBUG

    Message("imagebase: %x, virtualsectionoffset: %x, rawsectionoffset: %x\n",imagebase,virtualsectionoffset,rawsectionoffset);

    

    for(fidx = 2; fidx >= 0; fidx = fidx - 1)

    {

        funcname    = GetArrayElement(AR_STR, hfunctionnames, 2*fidx); 

        opcodeslen  = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1); 

        hopcodes    = GetArrayId(funcname);

        for(oidx = 0; oidx < opcodeslen; oidx = oidx + 1)

        {

            byte = GetArrayElement(AR_LONG, hopcodes, oidx);

            PatchByte(iwriteaddr, byte);

            if(writetofile == 1)

            {

                fileoffset = GetFileOffset(iwriteaddr, imagebase, virtualsectionoffset, rawsectionoffset);

                writeerror = fseek(hfile, fileoffset, 0);

                writeerror = fputc(byte, hfile);

                if(writeerror == -1)

                {

                    Message("Could not write to file (RVA %x)",iwriteaddr);

                    return;

                }

                Message("Write byte %x to file offset %x\n", byte, fileoffset);

            }

            

            iwriteaddr = iwriteaddr + 1;

        }

    }

}



static main()

{

    auto hfile, e_lfanew, imagebase, virtualsectionoffset, rawsectionoffset, writetofile, section;

    auto didx, inumberoffunctions, hfunctionnames, hpermutation, hfunctionaddresses, hnewfunctionaddresses;

    auto haddresstranslationlut, hglobalvars;

    

    writetofile              = 1;

    

    // This is init stuff and should be wrapped into a separate init function

    if(writetofile == 1)

    {

        section              = 0;

        hfile                = GetFileHandle("rb");

        e_lfanew             = GetPointerToPEHeader(hfile);

        imagebase            = GetImageBase(hfile, e_lfanew);

        virtualsectionoffset = GetVirtualSectionOffset(hfile, e_lfanew, section);

        rawsectionoffset     = GetRawSectionOffset(hfile, e_lfanew, section);

        

        hglobalvars          = CreateArray("GlobalVars");

        if(hglobalvars == -1)

        {

            // If array already exist get the handle by GetArrayId

            hglobalvars = GetArrayId("GlobalVars");

        }

        SetArrayLong(hglobalvars, 0, imagebase);

        SetArrayLong(hglobalvars, 1, virtualsectionoffset);

        SetArrayLong(hglobalvars, 2, rawsectionoffset);

        fclose(hfile);

        hfile                = GetFileHandle("r+");

    }

    

    // Get number of functions 

    inumberoffunctions = GetNumberOfFunctions();

    

    // DEBUG

    // Message("Number of functions %d\n",inumberoffunctions);



    // Create permutation array

    hpermutation = CreatePermutation(inumberoffunctions);

    

    // Get current function addresses

    hfunctionaddresses = GetFunctionAddresses();

    

    // Get addresses after permutation

    hnewfunctionaddresses = GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions);    



    // Pre-processing, create address translation lookup table

    haddresstranslationlut = CreateAddressTranslationLUT(hnewfunctionaddresses);

    

    PatchInPlace(haddresstranslationlut);

    

    //DEBUG  

    //for(didx = 0; didx<inumberoffunctions; didx++)

    //{

    //    Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, didx));

    //}

    //return;

    

    // This array is populated with names of functions and

    // the length of the functions in dwords in the following

    // way [name1,length1,name2,length2,...]

    hfunctionnames = CreateArray("FunctionNames");

     

    if(hfunctionnames == -1)

    {

        // If array already exist get the handle by GetArrayId

        Message("hfunctionnames is -1.\n");

        hfunctionnames = GetArrayId("FunctionNames");

    }

 

    //  Enumerate functions and store them in a persistent array

    inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);    



    // Print functions in IDA's output window

    PrintFunctions(hfunctionnames, inumberoffunctions);

    

    // Write Back functions in reversed order

    WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000, writetofile, hfile);

 

    if(writetofile == 1)

    {

        fclose(hfile);

    }

 

    MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);

    AnalyzeArea (MinEA(), MaxEA());      

}
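
For anyone following the arithmetic in PatchInPlace: the script computes LUT[target] - (LUT[inst] + 6) and then writes that value plus 1, which comes to the new target minus the address of the instruction after the call. Below is a minimal standalone Python sketch of the same calculation, run outside IDA on a plain byte buffer. It assumes a 5-byte E8 near call; the names call_rel32, patch_call and lut are made up for illustration and are not part of the script above.

```python
import struct

def call_rel32(call_addr, target_addr):
    # rel32 of a 5-byte E8 call is measured from the end of the instruction.
    return (target_addr - (call_addr + 5)) & 0xFFFFFFFF

def patch_call(code, offset, base, lut):
    """Re-encode the E8 call at code[offset] after functions have moved.

    code   -- bytearray holding the section bytes
    base   -- virtual address of code[0]
    lut    -- dict mapping old addresses to new addresses (hypothetical
              stand-in for the AddressTranslationLookupTable array)
    """
    if code[offset] != 0xE8:
        raise ValueError("not an E8 near call")
    old_addr = base + offset
    (old_rel,) = struct.unpack_from("<i", code, offset + 1)
    old_target = old_addr + 5 + old_rel
    # Addresses missing from the table are assumed not to move.
    new_rel = call_rel32(lut.get(old_addr, old_addr),
                         lut.get(old_target, old_target))
    struct.pack_into("<I", code, offset + 1, new_rel)
    return new_rel
```

Unlike the IDC version, this reads the old target from the encoded displacement rather than from the operand name, so it does not depend on IDA having named the callee.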



niaren
December 22nd, 2010, 16:51

After running the script below on watermark1.exe I learned about base relocations. They are what prevents the 'dewatermarked' exe from running.



The script below now also corrects the entry point, so it should (in theory) work on an exe with a fixed base or stripped relocation info.
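
Patching the exe on disk (and not just the IDA database) needs every virtual address translated to a raw file offset. Here is a rough standalone Python sketch of that translation, mirroring the header offsets used by the IDC helper functions (0x3C, 0x18 + 0x1C, 0xF8, 0x28, 0x0C, 0x14). It assumes a 32-bit PE with the standard 0xE0-byte optional header; read_pe_layout and va_to_file_offset are illustrative names only.

```python
import struct

def read_pe_layout(image, section=0):
    """Return (image_base, section_rva, section_raw_offset) for a PE32 image.

    Assumes the section table starts at e_lfanew + 0xF8, which only holds
    for a 32-bit PE with the usual 0xE0-byte optional header.
    """
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if bytes(image[e_lfanew:e_lfanew + 4]) != b"PE\0\0":
        raise ValueError("not a valid PE file")
    # IMAGE_NT_HEADERS.OptionalHeader.ImageBase
    (image_base,) = struct.unpack_from("<I", image, e_lfanew + 0x18 + 0x1C)
    header = e_lfanew + 0xF8 + section * 0x28
    (section_rva,) = struct.unpack_from("<I", image, header + 0x0C)
    (section_raw,) = struct.unpack_from("<I", image, header + 0x14)
    return image_base, section_rva, section_raw

def va_to_file_offset(va, image_base, section_rva, section_raw):
    # Same arithmetic as GetFileOffset in the IDC script.
    return va - image_base - section_rva + section_raw
```

The translation is only valid for addresses that lie inside the chosen section, which is why the script hardcodes section 0 (the code section of the toy application).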

If I manually correct the relevant values in the relocation directory in a hex editor, the 'dewatermarked' exe runs without any problems. Since watermark1.exe and watermark2.exe were built with relocation information and dynamic base, it would most likely be considered cheating if the script isn't updated to correct the relocation information as well.
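
To give an idea of what has to be corrected: the relocation directory is a sequence of blocks, each with a page RVA, a block size, and a list of 16-bit entries (type in the top 4 bits, offset within the page in the low 12 bits). A minimal standalone Python sketch for listing the fixups follows. It only parses the table; remapping each fixup through the address translation table is left out, and parse_reloc_blocks is an illustrative name.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, no fixup is applied
IMAGE_REL_BASED_HIGHLOW = 3   # 32-bit absolute address fixup

def parse_reloc_blocks(reloc):
    """Return a list of (rva, type) for every fixup in a raw .reloc blob."""
    fixups = []
    pos = 0
    while pos + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        if block_size < 8:
            break  # malformed or terminating block
        end = min(pos + block_size, len(reloc))
        for entry_pos in range(pos + 8, end - 1, 2):
            (entry,) = struct.unpack_from("<H", reloc, entry_pos)
            kind, offset = entry >> 12, entry & 0x0FFF
            if kind != IMAGE_REL_BASED_ABSOLUTE:
                fixups.append((page_rva + offset, kind))
        pos += block_size
    return fixups
```

Each fixup RVA that falls inside a moved function has to move with it, and the 32-bit value stored at that location also needs translating if it points into the reordered code.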



Hopefully there will be an Xmas version of the IDC script that takes care of the relocation information, but most likely it will be a New Year edition.





#include <idc.idc> // Mandatory include directive



static GetFileHandle(mode)

{

    auto hFile;

    

    hFile = fopen(GetInputFilePath(), mode);

    if (0 == hFile)

    {

        Message("Cannot open \"" + GetInputFile() + "\"");

    }

    return hFile;

}



static GetPointerToPEHeader(hfile)

{

    auto e_lfanew;

    

    // Seek to the e_lfanew field 

    if (0 != fseek(hfile, 0x3C, 0))

    {

        Message(" 1 Cannot seek in \"" + GetInputFile() + "\", handle: %x", hfile);

    }



    // Read the value of e_lfanew

    e_lfanew = readlong(hfile, 0);



    // Seek to IMAGE_NT_HEADERS

    if (0 != fseek(hfile, e_lfanew, 0))

    {

        Message(" 2 Cannot seek in \"" + GetInputFile() + "\", handle: %x, elfanew: %x\n", hfile, e_lfanew);

    }



    // Read the Signature

    if (0x00004550 != readlong(hfile, 0))

    {

        Message("Not a valid PE file");

    }

    return e_lfanew;

}



static GetImageBase(hfile, e_lfanew)

{

    auto imageBase;

    

    // Seek to the IMAGE_NT_HEADERS.OptionalHeader.ImageBase field

    if (0 != fseek(hfile, e_lfanew + 0x18 + 0x1C, 0))

    {

        Fatal(" 3 Cannot seek in \"" + GetInputFile() + "\"");

    }

    imageBase = readlong(hfile, 0);

    return imageBase;

}



static GetVirtualSectionOffset(hfile, e_lfanew, section)

{

    auto numberOfSections, sectionRva;

    

    // Seek to the IMAGE_FILE_HEADER.NumberOfSections field

    if (0 != fseek(hfile, e_lfanew + 0x06, 0))

    {

        Fatal(" 4 Cannot seek in \"" + GetInputFile() + "\"");

    }



    // Read the number of sections

    numberOfSections = readshort(hfile, 0);

    

    if (section >= numberOfSections)

    {

        Fatal("Invalid section");

    }



    // Seek to the desired section

    if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x0C, 0))

    {

        Fatal(" 5 Cannot seek in \"" + GetInputFile() + "\"");

    }



    sectionRva = readlong(hfile, 0);

    return sectionRva;

}



static GetRawSectionOffset(hfile, e_lfanew, section)

{

    auto pointerToRawData;

    

    // Seek to the desired section

    if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x14, 0))

    {

        Fatal(" 6 Cannot seek in \"" + GetInputFile() + "\"");

    }



    pointerToRawData = readlong(hfile, 0);

    return pointerToRawData;

}



static GetFileOffset(rva, imagebase, virtualsectionoffset, rawsectionoffset)

{

    return rva - imagebase - virtualsectionoffset + rawsectionoffset;

}



static GetNumberOfFunctions()

{

    auto addr, name, fidx;

    addr = 0;

    fidx = 0; // function index

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return fidx;

        }

        fidx = fidx + 1;        

    }

    return fidx;

}



static CreatePermutation(inumberoffunctions)

{

    auto hpermutation;

    

    hpermutation = CreateArray("Permutation");

    if(hpermutation == -1)

    {

        // If array already exist get the handle by GetArrayId

        hpermutation = GetArrayId("Permutation");

    }

    // Hardcoded permutation

    SetArrayLong(hpermutation, 0, 2);

    SetArrayLong(hpermutation, 1, 1);

    SetArrayLong(hpermutation, 2, 0);

    return hpermutation;

}

    

static GetFunctionAddresses()

{

    auto addr, name, fidx, hfunctionaddresses;

    addr = 0;

    fidx = 0; // function index

    

    hfunctionaddresses = CreateArray("FunctionAddresses");

    if(hfunctionaddresses == -1)

    {

        // If array already exist get the handle by GetArrayId

        hfunctionaddresses = GetArrayId("FunctionAddresses");

    }



    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return hfunctionaddresses;

        }

        SetArrayLong(hfunctionaddresses, 2*fidx, addr);

        SetArrayLong(hfunctionaddresses, 2*fidx+1, NextFunction(addr) - addr);

        

        fidx = fidx + 1;

    }

    return hfunctionaddresses;

}



static GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions)

{

    auto addr, pidx, fidx, hnewfunctionaddresses;

    addr = 0;

    pidx = 0;

    fidx = 0;

    

    hnewfunctionaddresses = CreateArray("NewFunctionAddresses");

    if(hnewfunctionaddresses == -1)

    {

        // If array already exist get the handle by GetArrayId

        hnewfunctionaddresses = GetArrayId("NewFunctionAddresses");

    }

    

    // Address of first function

    addr = NextFunction(addr);

    

    fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 

    SetArrayLong(hnewfunctionaddresses, fidx, addr);

    addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

    

    for(pidx=1; pidx < inumberoffunctions; pidx++)

    {

        fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 

        SetArrayLong(hnewfunctionaddresses, fidx, addr);

        addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

    }

    return hnewfunctionaddresses;

}



static CreateAddressTranslationLUT(hnewfunctionaddresses)

{

    auto addr, haddresstranslationlut, name, end, inst, newaddr, fidx;

    addr = 0;

    fidx = 0;

    

    haddresstranslationlut = CreateArray("AddressTranslationLookupTable");

    if(haddresstranslationlut == -1)

    {

        // If array already exist get the handle by GetArrayId

        haddresstranslationlut = GetArrayId("AddressTranslationLookupTable");

    }

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return haddresstranslationlut;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        // Get new base address of function

        newaddr = GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx);

        

        SetArrayLong(haddresstranslationlut, inst, newaddr);

        Message("haddresstranslationlut %x \n",haddresstranslationlut);

        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        while(inst < end)

        {

            SetArrayLong(haddresstranslationlut, inst, newaddr + (inst-addr));

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

        fidx = fidx + 1;

    }

    return haddresstranslationlut;

}



static PatchInPlaceDebug(haddresstranslationlut)

{

    auto addr, name, end, inst, newaddr;

    addr = 0;

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        while(inst < end)

        {

            Message("Address %x mapped to %x\n",inst,GetArrayElement(AR_LONG, haddresstranslationlut, inst));

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

    }

}



static PatchInPlace(haddresstranslationlut)

{

    auto addr, name, end, inst, newaddr, opidx, optype, newrva, nearaddr;

    addr = 0;

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        while(inst < end)

        {

            opidx = 0;

            optype = GetOpType(inst,opidx);

            while(optype > 0)

            {

                if(optype == 7)

                {

                    // Immediate Near Address

                    

                    // Maybe not necessary but check for call instruction

                    if(GetMnem(inst) == "call")

                    {

                        Message("Instruction at %x being patched.\n", inst);

                        nearaddr = LocByName(GetOpnd(inst, opidx));

                        newrva   = GetArrayElement(AR_LONG, haddresstranslationlut, nearaddr) - (GetArrayElement(AR_LONG, haddresstranslationlut, inst)+0x6);

                        PatchDword(inst+0x1, newrva+0x1);

                        if(nearaddr == BADADDR)

                        {

                            Message("Fatal error, error processing instruction at %x\n", inst);

                        }

                    }

                    else

                    {

                        Message("Unsupported! Unknown %s instruction needs to be patched.\n", GetMnem(inst));

                    }

                        

                }

                

                opidx++;

                optype = GetOpType(inst,opidx);

            }  

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

    }

}



static EnumerateAndStoreFunctions(hfunctionnames)

{

    auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;

    addr = 0;

    fidx = 0; // function index

    widx = 0; // word idx

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return fidx;

        }

        

        bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);

        if(bsuccess == 0)

        {

            Message("Saving name of function %s failed.",name);

        }

    

        tmphandle = CreateArray(name);

        if(tmphandle == -1)

        {

            tmphandle = GetArrayId(name);

        }



        inextfunction = NextFunction(addr);

        if(inextfunction == BADADDR)

        {

            inextfunction = GetFunctionAttr(addr, FUNCATTR_END);

        }

        

        widx = 0;

        for(tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 1)

        {

             SetArrayLong(tmphandle, widx, Byte (tmpaddr));

             widx = widx + 1;

        }

        bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);

        fidx = fidx + 1;        

    }

    return fidx;

}



static PrintFunctions(hfunctionnames, inumberoffunctions)

{

    auto fidx;

    for(fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)

    {

        Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));

    }

}



static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr, writetofile, hfile)

{

    auto fidx, oidx, funcname, hopcodes, opcodeslen;

    auto imagebase, virtualsectionoffset, rawsectionoffset;

    auto writeerror, byte, hglobalvars, fileoffset;

    

    if(writetofile == 1)

    {

        hglobalvars          = GetArrayId("GlobalVars");

        imagebase            = GetArrayElement(AR_LONG, hglobalvars, 0);

        virtualsectionoffset = GetArrayElement(AR_LONG, hglobalvars, 1);

        rawsectionoffset     = GetArrayElement(AR_LONG, hglobalvars, 2);

    }

    

    // DEBUG

    Message("imagebase: %x, virtualsectionoffset: %x, rawsectionoffset: %x\n",imagebase,virtualsectionoffset,rawsectionoffset);

    

    for(fidx = 2; fidx >= 0; fidx = fidx - 1)

    {

        funcname    = GetArrayElement(AR_STR, hfunctionnames, 2*fidx); 

        opcodeslen  = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1); 

        hopcodes    = GetArrayId(funcname);

        for(oidx = 0; oidx < opcodeslen; oidx = oidx + 1)

        {

            byte = GetArrayElement(AR_LONG, hopcodes, oidx);

            PatchByte(iwriteaddr, byte);

            if(writetofile == 1)

            {

                fileoffset = GetFileOffset(iwriteaddr, imagebase, virtualsectionoffset, rawsectionoffset);

                writeerror = fseek(hfile, fileoffset, 0);

                writeerror = fputc(byte, hfile);

                if(writeerror == -1)

                {

                    Message("Could not write to file (RVA %x)",iwriteaddr);

                    return;

                }

                Message("Write byte %x to file offset %x\n", byte, fileoffset);

            }

            

            iwriteaddr = iwriteaddr + 1;

        }

    }

}



static main()

{

    auto hfile, e_lfanew, imagebase, virtualsectionoffset, rawsectionoffset, writetofile, section;

    auto didx, inumberoffunctions, hfunctionnames, hpermutation, hfunctionaddresses, hnewfunctionaddresses;

    auto haddresstranslationlut, hglobalvars, main, call2main, newrva, fileoffset, writeerror;

    

    writetofile              = 1;

    

    // This is init stuff and should be wrapped into a separate init function

    if(writetofile == 1)

    {

        section              = 0;

        hfile                = GetFileHandle("rb");

        e_lfanew             = GetPointerToPEHeader(hfile);

        imagebase            = GetImageBase(hfile, e_lfanew);

        virtualsectionoffset = GetVirtualSectionOffset(hfile, e_lfanew, section);

        rawsectionoffset     = GetRawSectionOffset(hfile, e_lfanew, section);

        

        hglobalvars          = CreateArray("GlobalVars");

        if(hglobalvars == -1)

        {

            // If array already exist get the handle by GetArrayId

            hglobalvars = GetArrayId("GlobalVars");

        }

        SetArrayLong(hglobalvars, 0, imagebase);

        SetArrayLong(hglobalvars, 1, virtualsectionoffset);

        SetArrayLong(hglobalvars, 2, rawsectionoffset);



        // Get address of main

        main                 = LocByName("_main");

        if(main == BADADDR)

        {

            Message("Could not find _main. Aborting...\n");

            return;

        }

        call2main = RfirstB(main);

        if(GetMnem(call2main) != "call")

        {

            Message("Expecting to find call to _main. Unsuccessful. Aborting...\n");

            return;

        }

        

        fclose(hfile);

        hfile                = GetFileHandle("r+");

    }

    

    // Get number of functions 

    inumberoffunctions = GetNumberOfFunctions();

    

    // DEBUG

    // Message("Number of functions %d\n",inumberoffunctions);



    // Create permutation array

    hpermutation = CreatePermutation(inumberoffunctions);

    

    // Get current function addresses

    hfunctionaddresses = GetFunctionAddresses();

    

    // Get addresses after permutation

    hnewfunctionaddresses = GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions);    



    // Pre-processing, create address translation lookup table

    haddresstranslationlut = CreateAddressTranslationLUT(hnewfunctionaddresses);

    

    PatchInPlace(haddresstranslationlut);



    // Fix call to _main

    if(writetofile == 1)

    {

        if(GetOpType(call2main,0) != 7)

        {

            Message("Unexpected operand found at call2main. Aborting...\n");

            return;

        }

        newrva   = GetArrayElement(AR_LONG, haddresstranslationlut, main) - (call2main+0x6);

        PatchDword(call2main+0x1, newrva+0x1);  



        fileoffset = GetFileOffset(call2main+1, imagebase, virtualsectionoffset, rawsectionoffset);

        writeerror = fseek(hfile, fileoffset, 0);

        writeerror = writelong(hfile, newrva+0x1, 0);

        if(writeerror == -1)

        {

            Message("Could not patch call2main (newrva %x)", newrva);

            return;

        }

        Message("Write long %x to file offset %x\n", newrva, fileoffset);

    }

    

    //DEBUG  

    //for(didx = 0; didx<inumberoffunctions; didx++)

    //{

    //    Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, didx));

    //}

    //return;

    

    // This array is populated with names of functions and

    // the length of the functions in dwords in the following

    // way [name1,length1,name2,length2,...]

    hfunctionnames = CreateArray("FunctionNames");

     

    if(hfunctionnames == -1)

    {

        // If array already exist get the handle by GetArrayId

        Message("hfunctionnames is -1.\n");

        hfunctionnames = GetArrayId("FunctionNames");

    }

 

    //  Enumerate functions and store them in a persistent array

    inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);    



    // Print functions in IDA's output window

    PrintFunctions(hfunctionnames, inumberoffunctions);

    

    // Write Back functions in reversed order

    WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000, writetofile, hfile);

 

    if(writetofile == 1)

    {

        fclose(hfile);

    }

 

    MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);

    AnalyzeArea (MinEA(), MaxEA());      

}
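
The permutation in CreatePermutation is hardcoded for the three functions of the toy application. As a rough idea of how the layout step could be generalised to a whole new random permutation, here is a minimal standalone Python sketch of what GetNewFunctionAddresses computes. It is not IDC and not part of the script; new_function_addresses and random_order are illustrative names, and it assumes the functions are packed back to back starting at the address of the first one.

```python
import random

def random_order(count, seed=None):
    """Return a random permutation of the function indices 0..count-1."""
    order = list(range(count))
    random.Random(seed).shuffle(order)
    return order

def new_function_addresses(starts, sizes, order):
    """Lay the functions out back to back in the given order.

    starts[i], sizes[i] -- current start address and byte length of function i
    order               -- function indices in their new order
    Returns the new start address of each function, indexed like starts.
    """
    addr = starts[0]  # the first slot keeps the original base address
    new_starts = [None] * len(starts)
    for idx in order:
        new_starts[idx] = addr
        addr += sizes[idx]
    return new_starts
```

With the hardcoded order [2, 1, 0] this reproduces the reversed layout used above.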



dELTA
December 25th, 2010, 20:27
Very nice work niaren!

Next steps would be to update all locations that IDA classifies as addresses, and then the slightly more complicated inter-functional offsets (note: offsets != addresses). If the executable has a relocation table, all the addresses should be listed in there, so in that case you have already hit the jackpot. The relocs might be stripped from executables though (unlike DLLs), which would make the script useless on them [first protector counter-measure, woo]. If you relocate code at function level rather than object-file level, you might even have to mutate the code in more complex ways than just patching addresses in order to fix these inter-functional offsets, since the new relocated offset might need more bits than the original one. If not, I would think you can ignore them completely, since there would not be any in the object-file-level case.
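
To illustrate the "more bits" problem: a short jump (EB, or a 7x conditional jump) stores its offset in a single signed byte, so after reordering the new offset may no longer fit. A minimal standalone Python sketch of that check follows. It assumes 2-byte short jumps and a dictionary mapping old addresses to new ones; short_jumps_needing_widening is an illustrative name only.

```python
def short_jumps_needing_widening(jumps, lut):
    """Find short jumps whose new displacement no longer fits in rel8.

    jumps -- list of (jump_addr, target_addr) for 2-byte short jumps
    lut   -- dict mapping old addresses to new addresses (hypothetical)
    Returns a list of (jump_addr, new_displacement) for the broken ones.
    """
    broken = []
    for jump_addr, target in jumps:
        # rel8 is relative to the end of the 2-byte jump instruction.
        new_disp = lut[target] - (lut[jump_addr] + 2)
        if not -128 <= new_disp <= 127:
            broken.append((jump_addr, new_disp))
    return broken
```

Any jump reported here would have to be rewritten as a longer near jump, which changes the function size and shifts everything after it.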

After this, I guess testing on more and more complex executables is the way to go, until they possibly crash after your dewatermarking. Then analyze the crash/disassembly in IDA to see what kind of special case causes the script not to work, implement support for this special case, and iterate the procedure until the dewatermarking code shuffling produces a working IDA executable.

After that, in order to make it a serious "generic dewatermarker", my suggested steps are probably these (as loosely mentioned in my previous posts too):


Import table shuffler (should be easy as long as all code that is statically linked to imports is correctly analyzed by IDA)
Export table shuffler (should be easy no matter what)
PE resource directory shuffler (should be easy under normal conditions, I think)
Relocation table shuffler (this might be covered by what you already mention is your planned next step, but I cannot bring myself to remember if the contents of the relocation table can have arbitrary order, or if they must be ordered by relocated address - in the former case you should always randomly reshuffle the order of the relocations too, to eradicate any watermark entropy that might be hidden in this ordering).
Code-location-independent function diffing tool (checking for any differences within functions that are not related to their location (and thus neglecting differences relating to call and jump addresses/offsets in the code, but detecting all other differences), e.g. to see if there are any differences in used instructions in sub areas of functions etc. Do note that not all addresses/offsets can/should be ignored during this process though, only those related to jumps/calls in the code, since otherwise entropy can be hidden in e.g. the ordering of data in data sections!
Data area diffing tool (detecting differences in default data section contents).
Non PE-section data diffing tool (diffing content of non-PE-section parts of executable files, e.g. PE headers, code caves or data inserted between, before or after PE section areas in the executable).
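The "code-location-independent function diffing" idea from the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the caller already knows, e.g. from IDA, which byte ranges inside each function are call/jump operands, and the names `mask_operands` and `location_independent_diff` are hypothetical.

```python
def mask_operands(code, operand_spans):
    """Return a copy of code with every (offset, length) span zeroed."""
    masked = bytearray(code)
    for offset, length in operand_spans:
        masked[offset:offset + length] = b"\x00" * length
    return bytes(masked)

def location_independent_diff(code_a, spans_a, code_b, spans_b):
    """Compare two copies of a function while ignoring call/jump operands.

    Returns the list of byte offsets that still differ, or None when the
    two copies have different sizes and cannot be compared byte by byte.
    """
    a = mask_operands(code_a, spans_a)
    b = mask_operands(code_b, spans_b)
    if len(a) != len(b):
        return None
    return [i for i in range(len(a)) if a[i] != b[i]]
```

Any offset this returns is a difference that is not explained by relocation alone and would deserve a manual look. Data references are deliberately not masked, in line with the warning above about entropy hidden in data ordering.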


For any detected differences in code sections, you must mutate the affected functions with a code obfuscation algorithm to hope to remove any watermarking entropy hidden in their original implementation. This algorithm should at least obfuscate/morph instruction ordering and substitute instructions or instruction sequences with semantically equivalent instructions or instruction sequences. Only "changing their CRC" with simple antivirus evasion-style obfuscation (insertion of nops, xor decryption layer etc) won't remove much entropy from the possibility of recovery by manual analysis.

All code differences must also be analyzed manually by the reverser running the script/differ though, since a final resort of the protector might be to generate semantically different code in each watermarked version (e.g. setting a register to a serial number in some stray instruction somewhere), which would then not be removed by the above code obfuscation techniques. It could then be concluded by the reverser to be superfluous to proper program operation though, and thus completely nopped out instead.

If you follow this advice in your implementation, you'll have a pretty damn capable (and unique) generic dewatermarker tool in your hands I'd say.

disavowed
December 29th, 2010, 11:11
This is a bit off-topic since it's not about watermarking by linking order, but you may want to take a look at http://www.woodmann.com/crackz/Tutorials/Tsehpida.htm ("Reversing IDA 4.01 - Watermarked protection scheme") nonetheless.

dELTA
December 29th, 2010, 22:58
It's always good to be familiar with your history, thanks for the reference disa.

niaren
January 1st, 2011, 15:34

dELTA, I haven't really thought about the case where relocation information is stripped or about the inter-functional offset issue. That is a small setback. Let's see, I have no clue right now how big a problem it is. As you also mention, let's do it in (small) steps, by trial and error.

I also understand your point that the watermark leaves traces in the other directories and as such that would have to be 'taken care of' as well. I agree with you on everything 



I've just set up IDAPython and will start rewriting the IDC script in Python to see whether that causes many problems. Unless there are problems, I hope this means working smarter, not harder, and if that means being a masochist then yes, I'm that!



At least this example works, one line that prints out all functions and their addresses 




print [(hex(func),GetFunctionName(func)) for func in Functions(SegStart(ScreenEA()),SegEnd(ScreenEA()))]


dELTA
January 1st, 2011, 18:36
Sounds great, then we are looking much forward to your next status report, sitting here ready to answer any further questions.

dELTA
January 6th, 2011, 17:34
Shortly after posting my last message above, I came to think about adding TLS directory shuffling to my list of necessary things for "complete de-watermarkation" above.

BanMe just mentioned something else regarding TLS in another thread ("http://www.woodmann.com/forum/showthread.php?13985-can-anyone-tell-me-why-code-such-as-this-avoids-access-violation-on-write..&p=88774&viewfull=1#post88774"), directed at you niaren, and seemingly more specifically aimed at the discussion in this thread, so I'll include it here too:

Quote:
[Originally Posted by BanMe]*side note to niaren* What happens if the image has tls functions and these are not included in the reloc section.. I saw a function that relocs Tls as well somewhere but just noted it for interest..
Actually, all parts of the PE specification should be gone through carefully to make sure that there are none left in which entropy could be hidden. My short enumeration in the previous post was just approximate and off the top of my head.

You yourself (niaren) also mentioned above the more general problem of absolute addresses not being included in the reloc table (e.g. because there is no reloc table to begin with, due to reloc stripping).

I'd say that as long as all code in the executable is free from encryption/obfuscation, IDA will already have parsed up and classified all such addresses in the code for you, ripe and ready for your picking, so not much trouble at all really. Also, if any such addresses are part of a watermark (and thus differ between different copies of the executable) you will find them easily with the "Code-location-independent function diffing tool" that I mention above, since this tool will only ignore offsets and absolute addresses explicitly mentioned in the reloc table (or even just identified by IDA).

Finally, encrypted/obfuscated code containing watermark data will of course also be easily identified by the data and code diffing tools I mention.

niaren
January 6th, 2011, 18:01
dELTA, I keep getting distracted but I have begun the task of porting the IDC script to Python. Hopefully, I can do it this weekend. I will post here as soon as it is done.
Thanks a lot, you're extremely helpful

dELTA
January 7th, 2011, 10:05
Sounds great, looking forward to your progress reports.

This is a very good learning project (IDA scripting, PE structure, code anatomy, etc) and it would also be really cool if a good generic watermark detection/destruction tool would come out of it.

niaren
January 14th, 2011, 17:41

In order to make things even more simple the makefile has been changed such that

  - dependent functions are linked into the exe file (/MD  ->  /MT)

      Without this modification renaming the exefile would give an error e.g.  'msvcr90.dll not found'

  - Manifest file option is disabled

  - exe file is linked with no dynamic base and fixed base address

      With this modification it is possible to test the code reshuffling without worrying about reloc info.
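Whether a given executable was built this way can be read from two header fields. The sketch below is an illustration, not part of the scripts in this thread: the flag values come from the PE format documentation, while the helper name `reloc_situation` is hypothetical. With pefile the two inputs would be `pe.FILE_HEADER.Characteristics` and `pe.OPTIONAL_HEADER.DllCharacteristics`.

```python
# Flag values as defined in the PE/COFF format documentation.
IMAGE_FILE_RELOCS_STRIPPED = 0x0001             # in FileHeader.Characteristics
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040  # in OptionalHeader.DllCharacteristics

def reloc_situation(characteristics, dll_characteristics):
    """Summarize whether relocation info is present and whether ASLR is on."""
    return {
        "relocs_stripped": bool(characteristics & IMAGE_FILE_RELOCS_STRIPPED),
        "dynamic_base": bool(dll_characteristics & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE),
    }
```

For an exe linked with /FIXED and /DYNAMICBASE:NO one would expect `relocs_stripped` to be True and `dynamic_base` to be False, which is the situation where the image always loads at its preferred base and absolute addresses can be patched directly.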



As a side comment, how many functions would you guess are identified/found by IDA inside the 'new' exe? See the answer below.

(remember that we only defined 3 functions)



With these modifications the IDC script and the Python script below both reorder the functions (the 3 functions made by us) in the exe file, and the exe file actually runs afterwards.
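The core of the reordering is a small address calculation, sketched here outside IDA for clarity. It assumes the functions form one contiguous block and that each size includes the padding up to the next function; the name `new_layout` is hypothetical.

```python
def new_layout(starts, sizes, order):
    """Map each old function start to its start after reordering.

    starts[i] and sizes[i] describe function i. order lists function
    indices in the sequence they should be laid out, beginning at the
    lowest original start address.
    """
    addr = min(starts)
    mapping = {}
    for idx in order:
        mapping[starts[idx]] = addr
        addr += sizes[idx]
    return mapping
```

For example, three functions at 0x401000, 0x401020 and 0x401030 with sizes 0x20, 0x10 and 0x30, laid out in reversed order `[2, 1, 0]`, end up at 0x401040, 0x401030 and 0x401000 respectively. Every call whose source or target lies in the block then needs its displacement recomputed from this mapping.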







Things to do (in prioritized order) before moving on to other parts of the exe as pointed out by dELTA

- automatically find the range of functions that the script can reorder (hardcoded now)

- make python code look nice (subclass of PE class from pefile)

- make IDA output look nice (kayaker ideas)



New makefile

Code:

SRCS = main.c file1.c file2.c

OBJS1 = main.obj file1.obj file2.obj
OBJS2 = file2.obj file1.obj main.obj

CC        = CL
CCFLAGS   = /O2 /Oi /D "_MBCS" /FD /EHsc /MT /Gy /W3 /c /Zi /TC

LINK       = link
LINKFLAGS1 = "/OUT:watermark1.exe" /MANIFEST:NO /OPT:REF /OPT:ICF /DYNAMICBASE:NO /FIXED /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
LINKFLAGS2 = "/OUT:watermark2.exe" /MANIFEST:NO /OPT:REF /OPT:ICF /DYNAMICBASE:NO /FIXED /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib

EC = echo
RM = del

default: all

clean:
	@$(RM) /F *.obj
	@$(RM) /F *.idb
	@$(RM) /F *.pdb
	@$(RM) /F *.exe
	@$(RM) /F *manifest*

%.obj : %.c
	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
	@$(EC) ************************************************
	@$(EC) * Compiling $@
	$(CC)  $(CCFLAGS) $<

watermark1.exe: $(OBJS1)
	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
	$(LINK) $(LINKFLAGS1) $(OBJS1)
	$(LINK) $(LINKFLAGS2) $(OBJS2)

all: watermark1.exe



IDC script



#include <idc.idc> // Mandatory include directive



static GetFileHandle(mode)

{

    auto hFile;

    

    hFile = fopen(GetInputFilePath(), mode);

    if (0 == hFile)

    {

        Message("Cannot open \"" + GetInputFile() + "\"");

    }

    return hFile;

}



static GetPointerToPEHeader(hfile)

{

    auto e_lfanew;

    

    // Seek to the e_lfanew field 

    if (0 != fseek(hfile, 0x3C, 0))

    {

        Message(" 1 Cannot seek in \"" + GetInputFile() + "\", handle: %x", hfile);

    }



    // Read the value of e_lfanew

    e_lfanew = readlong(hfile, 0);



    // Seek to IMAGE_NT_HEADERS

    if (0 != fseek(hfile, e_lfanew, 0))

    {

        Message(" 2 Cannot seek in \"" + GetInputFile() + "\", handle: %x, elfanew: %x\n", hfile, e_lfanew);

    }



    // Read the Signature

    if (0x00004550 != readlong(hfile, 0))

    {

        Message("Not a valid PE file");

    }

    return e_lfanew;

}



static GetImageBase(hfile, e_lfanew)

{

    auto imageBase;

    

    // Seek to the IMAGE_NT_HEADERS.OptionalHeader.ImageBase field

    if (0 != fseek(hfile, e_lfanew + 0x18 + 0x1C, 0))

    {

        Fatal(" 3 Cannot seek in \"" + GetInputFile() + "\"");

    }

    imageBase = readlong(hfile, 0);

    return imageBase;

}



static GetVirtualSectionOffset(hfile, e_lfanew, section)

{

    auto numberOfSections, sectionRva;

    

    // Seek to the IMAGE_FILE_HEADER.NumberOfSections field

    if (0 != fseek(hfile, e_lfanew + 0x06, 0))

    {

        Fatal(" 4 Cannot seek in \"" + GetInputFile() + "\"");

    }



    // Read the number of sections

    numberOfSections = readshort(hfile, 0);

    

    if (section >= numberOfSections)

    {

        Fatal("Invalid section");

    }



    // Seek to the desired section

    if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x0C, 0))

    {

        Fatal(" 5 Cannot seek in \"" + GetInputFile() + "\"");

    }



    sectionRva = readlong(hfile, 0);

    return sectionRva;

}



static GetRawSectionOffset(hfile, e_lfanew, section)

{

    auto pointerToRawData;

    

    // Seek to the desired section

    if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x14, 0))

    {

        Fatal(" 6 Cannot seek in \"" + GetInputFile() + "\"");

    }



    pointerToRawData = readlong(hfile, 0);

    return pointerToRawData;

}



static GetFileOffset(rva, imagebase, virtualsectionoffset, rawsectionoffset)

{

    return rva - imagebase - virtualsectionoffset + rawsectionoffset;

}



static GetNumberOfFunctions()

{

    auto addr, name, fidx;

    addr = 0;

    fidx = 0; // function index

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return fidx;

        }

        fidx = fidx + 1;        

    }

    return fidx;

}



static CreatePermutation(inumberoffunctions)

{

    auto hpermutation;

    

    hpermutation = CreateArray("Permutation");

    if(hpermutation == -1)

    {

        // If array already exist get the handle by GetArrayId

        hpermutation = GetArrayId("Permutation");

    }

    // Hardcoded permutation

    SetArrayLong(hpermutation, 0, 2);

    SetArrayLong(hpermutation, 1, 1);

    SetArrayLong(hpermutation, 2, 0);

    return hpermutation;

}

    

static GetFunctionAddresses()

{

    auto addr, name, fidx, hfunctionaddresses;

    addr = 0;

    fidx = 0; // function index

    

    hfunctionaddresses = CreateArray("FunctionAddresses");

    if(hfunctionaddresses == -1)

    {

        // If array already exist get the handle by GetArrayId

        hfunctionaddresses = GetArrayId("FunctionAddresses");

    }



    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if( (name == "_pre_cpp_init") || (fidx==3) )

        {

            return hfunctionaddresses;

        }

        SetArrayLong(hfunctionaddresses, 2*fidx, addr);

        SetArrayLong(hfunctionaddresses, 2*fidx+1, NextFunction(addr) - addr);

        

        fidx = fidx + 1;

    }

    return hfunctionaddresses;

}



static GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions)

{

    auto addr, pidx, fidx, hnewfunctionaddresses;

    addr = 0;

    pidx = 0;

    fidx = 0;

    

    hnewfunctionaddresses = CreateArray("NewFunctionAddresses");

    if(hnewfunctionaddresses == -1)

    {

        // If array already exist get the handle by GetArrayId

        hnewfunctionaddresses = GetArrayId("NewFunctionAddresses");

    }

    

    // Address of first function

    addr = NextFunction(addr);

    

    fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 

    SetArrayLong(hnewfunctionaddresses, fidx, addr);

    addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

    

    for(pidx=1; pidx < inumberoffunctions; pidx++)

    {

        fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 

        SetArrayLong(hnewfunctionaddresses, fidx, addr);

        addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

    }

    return hnewfunctionaddresses;

}



static CreateAddressTranslationLUT(hnewfunctionaddresses)

{

    auto addr, haddresstranslationlut, name, end, inst, newaddr, fidx;

    addr = 0;

    fidx = 0;

    

    haddresstranslationlut = CreateArray("AddressTranslationLookupTable");

    if(haddresstranslationlut == -1)

    {

        // If array already exist get the handle by GetArrayId

        haddresstranslationlut = GetArrayId("AddressTranslationLookupTable");

    }

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if( (name == "_pre_cpp_init") || (fidx==3))

        {

            return haddresstranslationlut;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        // Get new base address of function

        newaddr = GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx);

        

        SetArrayLong(haddresstranslationlut, inst, newaddr);

        Message("haddresstranslationlut %x -> %x \n", inst, newaddr);

        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        while(inst < end)

        {

            SetArrayLong(haddresstranslationlut, inst, newaddr + (inst-addr));

            Message("haddresstranslationlut %x -> %x \n", inst, newaddr + (inst-addr));

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

        fidx = fidx + 1;

    }

    return haddresstranslationlut;

}



static PatchInPlaceDebug(haddresstranslationlut)

{

    auto addr, name, end, inst, newaddr;

    addr = 0;

    

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if(name == "_pre_cpp_init")

        {

            return;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        while(inst < end)

        {

            Message("Address %x mapped to %x\n",inst,GetArrayElement(AR_LONG, haddresstranslationlut, inst));

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }

    }

}



static PatchInPlace(haddresstranslationlut)

{

    auto addr, name, end, inst, newaddr, opidx, optype, newrva, nearaddr, fidx;

    auto nearaddrnew;

    addr = 0;

    fidx = 0;

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if( (name == "_pre_cpp_init") || (fidx == 3) )

        {

            return;

        }

        end  = GetFunctionAttr(addr, FUNCATTR_END);

        inst = addr;

        

        while(inst < end)

        {

            opidx = 0;

            optype = GetOpType(inst,opidx);

            while(optype > 0)

            {

                if(optype == 7)

                {

                    // Immediate Near Address

                    

                    // Maybe not necessary but check for call instruction

                    if(GetMnem(inst) == "call")

                    {

                        Message("Instruction at %x being patched.\n", inst);

                        nearaddr = LocByName(GetOpnd(inst, opidx));

                        Message("Operand near address is %x\n", nearaddr);

                        nearaddrnew = GetArrayElement(AR_LONG, haddresstranslationlut, nearaddr);

                        Message("Looking up near address: %x\n", nearaddrnew);

                        if(nearaddrnew == 0)

                        {

                            if(nearaddr == BADADDR)

                            {

                                Message("Fatal error, error processing instruction at %x\n", inst);

                            }

                            nearaddrnew = nearaddr;

                        }

                        newrva   = nearaddrnew - (GetArrayElement(AR_LONG, haddresstranslationlut, inst)+0x6);

                        PatchDword(inst+0x1, newrva+0x1);

                        

                    }

                    else

                    {

                        Message("Unsupported! Unknown %s instruction needs to be patched.\n", GetMnem(inst));

                    }

                        

                }

                

                opidx++;

                optype = GetOpType(inst,opidx);

            }  

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);



        }

        fidx = fidx + 1;    

    } // end for-loop

    

} // end function



static EnumerateAndStoreFunctions(hfunctionnames)

{

    auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;

    addr = 0;

    fidx = 0; // function index

    widx = 0; // word idx

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))

    {

        name = Name(addr);

        

        // Stop if name of function is _pre_cpp_init

        // It is assumed that compiler/linker generated functions

        // are appended in the end of image and that they start with the

        // _pre_cpp_init function

        if( (name == "_pre_cpp_init") || (fidx == 3) )

        {

            return fidx;

        }

        

        bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);

        if(bsuccess == 0)

        {

            Message("Saving name of function %s failed.",name);

        }

    

        tmphandle = CreateArray(name);

        if(tmphandle == -1)

        {

            tmphandle = GetArrayId(name);

        }



        inextfunction = NextFunction(addr);

        if(inextfunction == BADADDR)

        {

            inextfunction = GetFunctionAttr(addr, FUNCATTR_END);

        }

        

        widx = 0;

        for(tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 1)

        {

             SetArrayLong(tmphandle, widx, Byte (tmpaddr));

             widx = widx + 1;

        }

        bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);

        fidx = fidx + 1;        

    }

    return fidx;

}



static PrintFunctions(hfunctionnames, inumberoffunctions)

{

    auto fidx;

    for(fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)

    {

        Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));

    }

}



static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr, writetofile, hfile)

{

    auto fidx, oidx, funcname, hopcodes, opcodeslen;

    auto imagebase, virtualsectionoffset, rawsectionoffset;

    auto writeerror, byte, hglobalvars, fileoffset;

    

    if(writetofile == 1)

    {

        hglobalvars          = GetArrayId("GlobalVars");

        imagebase            = GetArrayElement(AR_LONG, hglobalvars, 0);

        virtualsectionoffset = GetArrayElement(AR_LONG, hglobalvars, 1);

        rawsectionoffset     = GetArrayElement(AR_LONG, hglobalvars, 2);

    }

    

    // DEBUG

    Message("imagebase: %x, virtualsectionoffset: %x, rawsectionoffset: %x\n",imagebase,virtualsectionoffset,rawsectionoffset);

    

    for(fidx = 2; fidx >=0 ; fidx = fidx - 1)

    {

        funcname    = GetArrayElement(AR_STR, hfunctionnames, 2*fidx); 

        opcodeslen  = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1); 

        hopcodes    = GetArrayId(funcname);

        for(oidx = 0; oidx < opcodeslen; oidx = oidx + 1)

        {

            byte = GetArrayElement(AR_LONG, hopcodes, oidx);

            PatchByte(iwriteaddr, byte);

            if(writetofile == 1)

            {

                fileoffset = GetFileOffset(iwriteaddr, imagebase, virtualsectionoffset, rawsectionoffset);

                writeerror = fseek(hfile, fileoffset, 0);

                writeerror = fputc(byte, hfile);

                if(writeerror == -1)

                {

                    Message("Could not write to file (RVA %x)",iwriteaddr);

                    return;

                }

                Message("Write byte %x to file offset %x\n", byte, fileoffset);

            }

            

            iwriteaddr = iwriteaddr + 1;

        }

    }

}



static main()

{

    auto hfile, e_lfanew, imagebase, virtualsectionoffset, rawsectionoffset, writetofile, section;

    auto didx, inumberoffunctions, hfunctionnames, hpermutation, hfunctionaddresses, hnewfunctionaddresses;

    auto haddresstranslationlut, hglobalvars, main, call2main, newrva, fileoffset, writeerror, fidx;

    

    writetofile              = 1;

    

    // This is init stuff and should be wrapped into a separate init function

    if(writetofile == 1)

    {

        section              = 0;

        hfile                = GetFileHandle("rb");

        e_lfanew             = GetPointerToPEHeader(hfile);

        imagebase            = GetImageBase(hfile, e_lfanew);

        virtualsectionoffset = GetVirtualSectionOffset(hfile, e_lfanew, section);

        rawsectionoffset     = GetRawSectionOffset(hfile, e_lfanew, section);

        

        hglobalvars          = CreateArray("GlobalVars");

        if(hglobalvars == -1)

        {

            // If array already exist get the handle by GetArrayId

            hglobalvars = GetArrayId("GlobalVars");

        }

        SetArrayLong(hglobalvars, 0, imagebase);

        SetArrayLong(hglobalvars, 1, virtualsectionoffset);

        SetArrayLong(hglobalvars, 2, rawsectionoffset);



        // Get address of main

        main                 = LocByName("_main");

        if(main == BADADDR)

        {

            Message("Could not find _main. Aborting...\n");

            return;

        }

        call2main = RfirstB(main);

        if(GetMnem(call2main) != "call")

        {

            Message("Expecting to find call to _main. Unsuccessful. Aborting...\n");

            return;

        }

        

        fclose(hfile);

        hfile                = GetFileHandle("r+");

    }

    

    // Get number of functions 

    inumberoffunctions = GetNumberOfFunctions();

    inumberoffunctions = 3;

        

    // DEBUG

    Message("Number of functions %d\n",inumberoffunctions);



    // Create permutation array

    hpermutation = CreatePermutation(inumberoffunctions);

    

    // Get current function addresses

    hfunctionaddresses = GetFunctionAddresses();

    for(fidx = 0; fidx < inumberoffunctions; fidx++)

    {

        Message("Function address: %x\n", GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx));

    }

    

        

    // Get addresses after permutation

    hnewfunctionaddresses = GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions);

    for(fidx = 0; fidx < inumberoffunctions; fidx++)

    {

        Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx));

    }

    

    

    // Pre-processing, create address translation lookup table

    haddresstranslationlut = CreateAddressTranslationLUT(hnewfunctionaddresses);

      

    PatchInPlace(haddresstranslationlut);

   



    // Fix call to _main

    if(writetofile == 1)

    {

        if(GetOpType(call2main,0) != 7)

        {

            Message("Unexpected operand found at call2main. Aborting...\n");

            return;

        }

        newrva   = GetArrayElement(AR_LONG, haddresstranslationlut, main) - (call2main+0x6);

        PatchDword(call2main+0x1, newrva+0x1);  



        fileoffset = GetFileOffset(call2main+1, imagebase, virtualsectionoffset, rawsectionoffset);

        writeerror = fseek(hfile, fileoffset, 0);

        writeerror = writelong(hfile, newrva+0x1, 0);

        if(writeerror == -1)

        {

            Message("Could not patch call2main (newrva %x)", newrva);

            return;

        }

        Message("Write long %x to file offset %x\n", newrva, fileoffset);

    }

    

    //DEBUG  

    //for(didx = 0; didx<inumberoffunctions; didx++)

    //{

    //    Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, didx));

    //}

    //return;

    

    // This array is populated with names of functions and

    // the length of the functions in bytes in the following

    // way [name1,length1,name2,length2,...]

    hfunctionnames = CreateArray("FunctionNames");

     

    if(hfunctionnames == -1)

    {

        // If array already exist get the handle by GetArrayId

        Message("hfunctionnames is -1.\n");

        hfunctionnames = GetArrayId("FunctionNames");

    }

 

    //  Enumerate functions and store them in a persistent array

    inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);    



    // Print functions in IDA's output window

    PrintFunctions(hfunctionnames, inumberoffunctions);

    

    // Write Back functions in reversed order

    WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000, writetofile, hfile);

 

    if(writetofile == 1)

    {

        fclose(hfile);

    }

 

    MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);

    AnalyzeArea (MinEA(), MaxEA());      

}







Python script (this script creates a new permutation for each invocation)



import pefile

import random

import sys

from collections import defaultdict





intsize = 32

magic   = 2**intsize

def dec2hex(x, m=magic):

    if x<0:

        return m+x

    else:

        return x

    

class DEWA(pefile.PE):

    

    def PrintStuff(self):

        print self.DOS_HEADER.e_lfanew

        print self.OPTIONAL_HEADER.ImageBase

        print self.sections[0].PointerToRawData

        print self.sections[0].VirtualAddress



dewa = DEWA('C:\\rce\\LinkOrder\\manifestless\\watermark1.exe')

dewa.PrintStuff()



main = LocByName("_main")

if main == BADADDR:

    print "Unexpected result: could not find _main, LocByName(\"_main\")"

    sys.exit(1)

    

print "address of main is " + hex(main)

print "entry point read from file is " + hex(dewa.OPTIONAL_HEADER.AddressOfEntryPoint)

call2main = RfirstB(main)

if(GetMnem(call2main) != "call"):

    Message("Unexpected result: Expecting to find call to _main. Unsuccessful. Aborting...")

    sys.exit(1)



# Get functions

funcs = []

for idx,func in enumerate(Functions()):

    if Name(func) == "_pre_cpp_init":

        break

    if idx > 2:

        break

    funcs.append(func)



funclens  = []

func2func = []

# Loop over the function and check if they represent a contiguous block

for idx in range(len(funcs)-1):

    curfunc = funcs[idx]

    end_addr = GetFunctionAttr(curfunc, FUNCATTR_END)

    funclens.append(end_addr - curfunc)

    func2func.append(funcs[idx+1] - curfunc)

    for byte in range(end_addr, funcs[idx+1]):

        if Byte(byte) != 204:  # 0xCC

            print "Warning: Test for continuity failed."

            print "func number %i at address %s" % (idx, hex(curfunc))



# Last function is treated separately

end_addr = GetFunctionAttr(funcs[-1], FUNCATTR_END)

funclens.append(end_addr-funcs[-1])

last     = NextFunction(funcs[-1])

if last != BADADDR:

    func2func.append(last - funcs[-1])

else:

    func2func.append(funclens[-1])    

    

print funclens

print func2func

# Get number of functions

no_functions = len(funcs)



# Make a permutation

funcorder = range(no_functions)



# make a copy of funcorder

new_funcorder = funcorder[:]

random.shuffle(new_funcorder)



# check the permutation

while True:

    count = 0

    for idx in range(no_functions):

        if funcorder[idx] == new_funcorder[idx]:

            count = count + 1

    if count > 0:

        new_funcorder = funcorder[:]

        random.shuffle(new_funcorder)

    else:

        break



##print funcorder

##map(print hex(funcs), funcs)

funcmap = dict()

curaddr = funcs[0]

for idx in xrange(no_functions):

    curfunc = funcs[new_funcorder[idx]]

    funcmap[hex(curfunc)] = hex(curaddr)

    curaddr = curaddr + func2func[new_funcorder[idx]]

 

print new_funcorder    

print funcmap    



# Build Address Translation LUT

atlut = dict()

for func in funcs:

    func_end = GetFunctionAttr(func, FUNCATTR_END)

    newfuncaddr = funcmap[hex(func)]

    inst = func

    atlut[hex(inst)] = newfuncaddr

    inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT)

    while inst < func_end:

        atlut[hex(inst)] = hex(int(newfuncaddr,16) + (inst-func))

        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT)



print atlut

print "*************************************"

# Patch in-place

for func in funcs:

    func_end = GetFunctionAttr(func, FUNCATTR_END)

    instr    = func

    while instr < func_end:
        print "Considering %s." % hex(instr)
        opidx  = 0
        optype = GetOpType(instr, opidx)
        while optype > 0:
            if(optype == 7):
                # Immediate near address
                # Maybe not necessary but check for call instruction
                if(GetMnem(instr) == 'call'):
                    print "Instruction at %s being patched." % hex(instr)
                    nearaddr = LocByName(GetOpnd(instr, opidx))
                    print hex(nearaddr)
                    print "atlut[hex(instr)] is " + atlut[hex(instr)]
                    if (hex(nearaddr) in atlut):
                        newrva   = int(atlut[hex(nearaddr)],16) - (int(atlut[hex(instr)],16)+0x6)
                    else:
                        # LocByName returns BADADDR when the name cannot be resolved
                        if (nearaddr == BADADDR):
                            print "Fatal error! error processing instruction at %s" % hex(instr)
                            sys.exit(1)
                        else:
                            newrva   = nearaddr - (int(atlut[hex(instr)],16)+0x6)
                    print "newrva is " + str(newrva) + " in hex " + hex(newrva)
                    # add 1 before wrapping to 32 bits so that -1 maps to 0
                    PatchDword(instr+1, dec2hex(newrva+1))
                else:
                    print "Unsupported! Unknown %s instruction needs to be patched" % GetMnem(instr)

            opidx = opidx + 1
            optype = GetOpType(instr, opidx)

        instr = FindCode(instr, SEARCH_DOWN | SEARCH_NEXT)



# Enumerate and store functions

funcdict = defaultdict(list)

funcidx  = 0 # used for indexing into funclens list

for func in funcs:

    funclen = func2func[funcidx]

    for addr in range(func,func+funclen):

        funcdict[funcidx].append(Byte(addr))

    funcidx = funcidx + 1



print funcdict



imbase = dewa.OPTIONAL_HEADER.ImageBase

# layout the functions in the new order

writeaddr = funcs[0]

print hex(writeaddr)

for fidx in new_funcorder:

    print fidx

    for codebyte in funcdict[fidx]:

        print hex(writeaddr), hex(codebyte)

        PatchByte(writeaddr, codebyte)

        dewa.set_bytes_at_rva(writeaddr-imbase, chr(codebyte))

        writeaddr = writeaddr + 1



# Correct entry point

if(GetOpType(call2main,0) != 7):

    print "Unexpected operand found at call2main. Aborting...\n"

    sys.exit(1)



print atlut[hex(main)]

print int(atlut[hex(main)],16)

print call2main

print newrva

newrva   = int(atlut[hex(main)],16) - (call2main+0x6);

PatchDword(call2main+0x1, dec2hex(newrva+1));  

dewa.set_dword_at_rva(call2main+1-imbase, dec2hex(newrva+1))



# Commit to file / save to disk

dewa.write('C:\\rce\\LinkOrder\\manifestless\\dewatermark1.exe')











Answer: 235!?


dELTA
January 14th, 2011, 19:44
Looking great niaren.

About this:
Quote:
[Originally Posted by niaren]automatically find the range of functions that the script can reorder (hardcoded now)
Shouldn't that always be "all of them"?

niaren
January 15th, 2011, 16:46

Quote:
[Originally Posted by dELTA;89088]
Shouldn't that always be "all of them"?



My assumption was that the 'overhead/auxiliary functions' introduced by the compiler are always appended to the exe file. As such they don't contribute to the watermark. However, instead of creating a more complicated exe file for further development and testing, we might just as well try to reorder all the functions in the watermark1.exe file instead of just the 3 functions. I have tried to do that.



The result is an improved python script (see bottom). With this script it is possible to reorder a range of functions that obey a continuity constraint. As it turns out, the first 9 functions in the watermark1.exe file fulfill this constraint and can be reordered with the script.



Continuity constraint:

A consecutive sequence of functions, as identified by IDA, is said to be continuous if the space in between functions (if there is any) consists of sequences of 0xCC (int 3) bytes.



In other words, we expect to find 0xCC sequences in between functions in order to align functions to paragraphs. If this is not the case we have no real control of what is going on with the bytes in between the functions. It could very well be code.
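
As a rough illustration of the constraint (a standalone sketch, not the IDAPython script itself), the test can be written against a plain byte buffer. The buffer layout, the function ranges and the name `find_broken_gaps` are made up for the example.

```python
INT3 = 0xCC  # padding byte expected between functions

def find_broken_gaps(image, funcs):
    """Return the (gap_start, gap_end) ranges that contain non-0xCC bytes.

    image -- bytes of the code region, indexed from offset 0 (hypothetical)
    funcs -- list of (start, end) offsets sorted by start, end exclusive
    """
    bad = []
    for (_, end), (next_start, _) in zip(funcs, funcs[1:]):
        gap = image[end:next_start]
        if any(b != INT3 for b in gap):
            bad.append((end, next_start))
    return bad

# Toy layout: A and B are separated by int 3 padding, while the bytes
# between B and C are real instructions (push esi / pop esi).
image = bytes([0xC3, 0xCC, 0xCC, 0xCC,   # func A (ret) + padding
               0x90, 0xC3, 0x56, 0x5E,   # func B (nop; ret) + stray code
               0xC3])                    # func C (ret)
funcs = [(0, 1), (4, 6), (8, 9)]
broken = find_broken_gaps(image, funcs)
print(broken)
```

An empty result would mean the whole range is continuous in the sense defined above; any reported gap marks a place where reordering could silently move or overwrite something that is not padding.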



I will now depict a scenario in the watermark1.exe file where this continuity constraint is not obeyed and which furthermore turns out to be relatively hard to deal with in de-watermarking terms (at least I think so).



Below is the IDA output for watermark1, where sub_401276 is the 10th function (counting from the lowest address). We see that the bytes following the end of function sub_401276 are not identified (or at least not indicated) by IDA as a function. We could regard these bytes as code and move them. However, there are no references (Xrefs) to the code. Therefore we have no direct way to patch the code that calls this code, if it gets called.



Code:

.text:0040126C                 public $LN25
.text:0040126C $LN25           proc near             <-------------  OEP (not interesting)
.text:0040126C                 call    ___security_init_cookie
.text:00401271                 jmp     ___tmainCRTStartup
.text:00401271 $LN25           endp
.text:00401271
.text:00401276
.text:00401276 ; =============== S U B R O U T I N E =======================================
.text:00401276
.text:00401276
.text:00401276 sub_401276      proc near               ; CODE XREF: sub_40102C:loc_401063p
.text:00401276                                         ; sub_40102C+4Ep ...
.text:00401276                 mov     eax, offset off_40C008
.text:0040127B                 retn
.text:0040127B sub_401276      endp
.text:0040127B
.text:0040127C ; ---------------------------------------------------------------------------
.text:0040127C
.text:0040127C ___initstdio:                     <- instructions/data located in between functions
.text:0040127C                 mov     eax, dword_40EAC0
.text:00401281                 push    esi
.text:00401282                 push    14h
.text:00401284                 pop     esi
.text:00401285                 test    eax, eax
.text:00401287                 jnz     short loc_401290
.text:00401289                 mov     eax, 200h
.text:0040128E                 jmp     short loc_401296



In fact, this code gets called by the following piece of code (inside the __initterm_e function):



Code:

; esi holds the address of an array of addresses; the loop is executed until a non-NULL pointer is found, and execution is then directed to the function at the corresponding address
.text:004026C4 loc_4026C4:                             ; CODE XREF: __initterm_e+1Fj
.text:004026C4                 test    eax, eax
.text:004026C6                 jnz     short loc_4026D8
.text:004026C8                 mov     ecx, [esi]
.text:004026CA                 test    ecx, ecx
.text:004026CC                 jz      short loc_4026D0
.text:004026CE                 call    ecx
.text:004026D0
.text:004026D0 loc_4026D0:                             ; CODE XREF: __initterm_e+15j
.text:004026D0                 add     esi, 4
.text:004026D3
.text:004026D3 loc_4026D3:                             ; CODE XREF: __initterm_e+Bj
.text:004026D3                 cmp     esi, [ebp+arg_4]
.text:004026D6                 jb      short loc_4026C4



This seems like a quite difficult situation to handle from a de-watermarking point of view. One possible approach could be to search data section for 0040127C values...but that doesn't seem very feasible.
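
To make the idea of searching the data section concrete, here is a minimal standalone sketch that scans a byte blob for a little-endian 32-bit value. The blob, its base address `0x0040C000` and the helper name `find_dword` are assumptions for illustration only; in practice the bytes would come from the real data section.

```python
import struct

def find_dword(data, value, base=0):
    """Yield every address in data where the little-endian dword value occurs."""
    needle = struct.pack("<I", value)
    pos = data.find(needle)
    while pos != -1:
        yield base + pos
        pos = data.find(needle, pos + 1)

# Toy data blob: a small table of function pointers with NULL entries,
# similar in shape to what the __initterm_e loop walks.
section_base = 0x0040C000   # hypothetical address of the blob
table = struct.pack("<4I", 0, 0x0040127C, 0, 0x00401300)
hits = list(find_dword(table, 0x0040127C, section_base))
print([hex(h) for h in hits])
```

A hit only says that four bytes happen to match. Nothing proves they are a code pointer rather than ordinary data, which is one reason a blind search like this is unreliable on its own.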



Updated python script



import pefile

import random

from collections import defaultdict





intsize = 32

magic   = 2**intsize

def dec2hex(x, m=magic):

    if x<0:

        return m+x

    else:

        return x

    

class DEWA(pefile.PE):



    def GetFunctionFromVA(self, va):

        for func in Functions():

            if func > va:

                f = PrevFunction(func)

                return f



        

    def CheckInstr(self, instr):

        mnem = GetMnem(instr)

        if (mnem == "call") or (mnem == "jmp"):

            if(Byte(instr) == 0xeb):  # jmp short

                return False

            if(Byte(instr) == 0xff):  # call indirect

                return False

            

            # Immediate near address

            if(GetOpType(instr, 0) != 7):

                print "Unexpected %s operand %i found at %s. Aborting...\n" % (mnem, GetOpType(instr, 0), hex(instr))

                return False

            else:

                nearaddr = LocByName(GetOpnd(instr, 0))

                if(nearaddr == BADADDR):

                    print "Error locating near address at %s. Aborting...\n" % (hex(instr))

                    return False

                else:

                    # check that near address falls outside of current function

                    fcur = self.GetFunctionFromVA(instr)

                    if(fcur == BADADDR):

                        print "GetFunctionFromVA Error at %s. Aborting...\n" % (hex(instr))

                        return

                    fnext = NextFunction(fcur)

                    if(fnext == BADADDR):

                        print "NextFunction Error at %s. Aborting...\n" % (hex(instr))

                        return

                    if(((instr+nearaddr) >= fcur) and ((instr+nearaddr) <= fnext)):

                        print "instr %s, fcur %s, fnext %s\n" % (hex(instr),hex(fcur),hex(fnext))

                        return False

                    else:

                        return True

        else:

            return False





    def PatchInstr(self, instr, atlut):

        mnem = GetMnem(instr)

        if (mnem == "call") or (mnem == "jmp"):

            nearaddr = LocByName(GetOpnd(instr, 0))

            if (hex(nearaddr) in atlut):

                newrva   = int(atlut[hex(nearaddr)],16) - (int(atlut[hex(instr)],16)+0x6)

            else:

                newrva   = nearaddr - (int(atlut[hex(instr)],16)+0x6)

            print "oldrva: " + hex(nearaddr) + " newrva is " + str(newrva) + " in hex " + hex(newrva)

            # add 1 before wrapping to 32 bits so that -1 maps to 0
            PatchDword(instr+1, dec2hex(newrva+1))

            



    def PatchXrefInstr(self, instr, addr, atlut, imbase):

        newrva   = int(atlut[hex(addr)],16) - (instr+0x6);

        PatchDword(instr+0x1, dec2hex(newrva+1));  

        self.set_dword_at_rva(instr+1-imbase, dec2hex(newrva+1))



    

    def PrintStuff(self):

        print self.DOS_HEADER.e_lfanew

        print self.OPTIONAL_HEADER.ImageBase

        print self.sections[0].PointerToRawData

        print self.sections[0].VirtualAddress





def idapymain():

    dewa = DEWA('C:\\rce\\LinkOrder\\manifestless\\watermark1.exe')

    dewa.PrintStuff()



    imbase = dewa.OPTIONAL_HEADER.ImageBase

    

    main = LocByName("_main")

    # LocByName returns BADADDR when the name is not found
    if (main == BADADDR):

        print "Unexpected result: could not find _main, LocByName(\"_main\")"

        return

        

    print "address of main is " + hex(main)

    print "entry point read from file is " + hex(dewa.OPTIONAL_HEADER.AddressOfEntryPoint)



    # Get functions

    funcs = []

    for idx,func in enumerate(Functions()):

        if Name(func) == "_pre_cpp_init":

            break

        if idx > 9:

            break

        funcs.append(func)



    funclens  = []

    func2func = []

    # Loop over the function and check if they represent a contiguous block

    for idx in range(len(funcs)-1):

        curfunc = funcs[idx]

        end_addr = GetFunctionAttr(curfunc, FUNCATTR_END)

        funclens.append(end_addr - curfunc)

        func2func.append(funcs[idx+1] - curfunc)

        for byte in range(end_addr, funcs[idx+1]):

            if Byte(byte) != 204:  # 0xCC

                print "Warning: Test for continuity failed."

                print "func number %i at address %s" % (idx, hex(curfunc))



    # Last function is treated separately

    end_addr = GetFunctionAttr(funcs[-1], FUNCATTR_END)

    funclens.append(end_addr-funcs[-1])

    last     = NextFunction(funcs[-1])

    if last != BADADDR:

        func2func.append(last - funcs[-1])

    else:

        func2func.append(funclens[-1])    

        

    print funclens

    print func2func

    # Get number of functions

    no_functions = len(funcs)



    # Make a permutation

    funcorder = range(no_functions)



    # make a copy of funcorder

    new_funcorder = funcorder[:]

    random.shuffle(new_funcorder)



    # check the permutation

    while True:

        count = 0

        for idx in range(no_functions):

            if funcorder[idx] == new_funcorder[idx]:

                count = count + 1

        if count > 0:

            new_funcorder = funcorder[:]

            random.shuffle(new_funcorder)

        else:

            break



    ##print funcorder

    ##map(print hex(funcs), funcs)

    funcmap = dict()

    curaddr = funcs[0]

    for idx in xrange(no_functions):

        curfunc = funcs[new_funcorder[idx]]

        funcmap[hex(curfunc)] = hex(curaddr)

        curaddr = curaddr + func2func[new_funcorder[idx]]

     

    print new_funcorder    

    print funcmap    



    # Build Address Translation LUT

    atlut = dict()

    for func in funcs:

        func_end = GetFunctionAttr(func, FUNCATTR_END)

        newfuncaddr = funcmap[hex(func)]

        inst = func

        atlut[hex(inst)] = newfuncaddr

        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT)

        while inst < func_end:

            atlut[hex(inst)] = hex(int(newfuncaddr,16) + (inst-func));

            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);



    print atlut

    print "*** Build ATLUT end *** \n\n"

    print "*** Handle xrefs start *** \n"

    

    for idx,func in enumerate(funcs):

        xref = RfirstB(func)

        while(xref != BADADDR):

            if( hex(xref) not in atlut):

                if (dewa.CheckInstr(xref) == False):

                    print "Unexpected Xref instruction found at %s. Aborting...\n" % (hex(xref))

                    return

                print "Reference to %i %s found at %s\n" % (idx, hex(func), hex(xref))

                dewa.PatchXrefInstr(xref, func, atlut, imbase)

            xref = RnextB(func, xref)

            

    print "*** Handle xrefs end *** \n\n"

    print "*** Patch in-place *** \n"

    

    # Patch in-place

    for func in funcs:

        func_end = GetFunctionAttr(func, FUNCATTR_END)

        instr    = func

        while instr < func_end:

            if(dewa.CheckInstr(instr)==True):

                print "Instruction at %s being patched." % hex(instr)

                dewa.PatchInstr(instr, atlut)



            instr = FindCode(instr, SEARCH_DOWN | SEARCH_NEXT);



    # Enumerate and store functions

    funcdict = defaultdict(list)

    funcidx  = 0 # used for indexing into funclens list

    for func in funcs:

        funclen = func2func[funcidx]

        for addr in range(func,func+funclen):

            funcdict[funcidx].append(Byte(addr))

        funcidx = funcidx + 1



    print funcdict



    # layout the functions in the new order

    writeaddr = funcs[0]

    print hex(writeaddr)

    for fidx in new_funcorder:

        print fidx

        for codebyte in funcdict[fidx]:

            #print hex(writeaddr), hex(codebyte)

            PatchByte(writeaddr, codebyte)

            dewa.set_bytes_at_rva(writeaddr-imbase, chr(codebyte))

            writeaddr = writeaddr + 1



    # Check entry point

    oep = dewa.OPTIONAL_HEADER.AddressOfEntryPoint + imbase

    print "OEP is %s\n" % hex(oep)

    if (hex(oep) in atlut):

        print "OEP is changed to %s\n" % atlut[hex(oep)]

        dewa.OPTIONAL_HEADER.AddressOfEntryPoint = int(atlut[hex(oep)],16) - imbase

    

    # Commit to file / save to disk

    dewa.write('C:\\rce\\LinkOrder\\manifestless\\dewatermark1.exe')

    print "Done!\n"

    

idapymain()
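
The displacement arithmetic used when patching near calls in the script above can be checked in isolation. The following is a standalone sketch (plain Python, no IDA needed) that assumes the 5-byte `E8 rel32` form of `call`; the addresses and helper names are made up for the example.

```python
import struct

CALL_LEN = 5  # E8 opcode byte + 4-byte displacement

def rel32(instr_addr, target_addr):
    """Displacement to store in a near call located at instr_addr."""
    return (target_addr - (instr_addr + CALL_LEN)) & 0xFFFFFFFF

def encode_call(instr_addr, target_addr):
    """Return the 5 bytes of a near call from instr_addr to target_addr."""
    return b"\xE8" + struct.pack("<I", rel32(instr_addr, target_addr))

def decode_target(instr_addr, encoded):
    """Recover the call target from the 5 encoded bytes."""
    (disp,) = struct.unpack("<i", encoded[1:5])  # signed displacement
    return instr_addr + CALL_LEN + disp

# Hypothetical addresses: a call at 0x401050 whose target moves
# from 0x401000 to 0x401020 after reordering.
before = encode_call(0x401050, 0x401000)
after = encode_call(0x401050, 0x401020)
print(before.hex(), after.hex())
```

The script computes `target - (instr + 6)` and then adds 1, which is the same quantity as `target - (instr + 5)` here; masking with `0xFFFFFFFF` plays the role of `dec2hex`.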



dELTA
January 15th, 2011, 22:07
Nice to see your progress.

I understand that indirect calls that are not caught in the IDA xrefs will pose a big problem. But if the full address of the called function is read from the data section, it must also be in the reloc data of the executable, so as long as relocs are not stripped, we should still be ok, right?

Also, for the cases with stripped relocs, maybe the de-watermarking tool could at least helpfully provide a list of all indirect call instructions in the entire program, so that they could be analyzed manually by the user, and then manually entered as resolved xrefs into IDA before proceeding with the final working de-watermarking procedure?

I have no idea how many this is in a normal program, but at least the program has then done all it can to help, which would be the goal - nothing more can be demanded. And if the user wants, he can then analyze/resolve/enter all of these and their resolved xrefs manually in IDA, and then again proceed with the tool to actually get a working de-watermarked executable!
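
For reference, the reloc idea can be sketched as follows: each base relocation block in a PE file starts with a page RVA and a block size, followed by 16-bit entries whose top 4 bits give the type and whose low 12 bits give the offset within the page. The block below is hand-built for illustration, not read from watermark1.exe, and `parse_reloc_block` is a made-up helper name.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, skipped
IMAGE_REL_BASED_HIGHLOW = 3    # 32-bit absolute address that needs a fixup

def parse_reloc_block(block):
    """Return the RVAs of all HIGHLOW fixups in one base relocation block."""
    page_rva, size = struct.unpack_from("<II", block, 0)
    count = (size - 8) // 2          # number of 16-bit entries after the header
    rvas = []
    for (entry,) in struct.iter_unpack("<H", block[8:8 + 2 * count]):
        kind, offset = entry >> 12, entry & 0x0FFF
        if kind == IMAGE_REL_BASED_HIGHLOW:
            rvas.append(page_rva + offset)
    return rvas

# Hand-built block for page RVA 0xC000: three fixups plus one padding entry.
entries = [(3 << 12) | 0x004, (3 << 12) | 0x00C, (3 << 12) | 0x010, 0]
block = struct.pack("<II", 0xC000, 8 + 2 * len(entries))
block += struct.pack("<4H", *entries)
fixups = parse_reloc_block(block)
print([hex(r) for r in fixups])
```

Each reported RVA is a location holding an absolute address. If the value stored there falls inside a function that gets moved, that location would need the same translation as the direct calls, assuming the relocation data has not been stripped.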