A Bit of a Teaser, And Feedback Request


L. Spiro
April 26th, 2006, 00:46
I have been working on a general hacking tool called Memory Hacking Software.

It has a searcher, debugger, disassembler, code injector, and some other things.

The reason for this post is that I am about to add a new feature and before it is done I would like a bit of feedback to ensure its maximum potential is reached.


I am adding a full programming language to the software.
The language itself is stand-alone; if I distribute the library, it can be used in any application to quickly add script support to that application (Windows x86 only).

Currently the primary features of the language are:
Compiled into bytecode like Java: Fast to execute.
90% C syntax: No need to learn a new language.
Structs/Unions: Not classes, but does support structures and unions like C.
Full pointer functionality: Allows &, *, and -> pointer-indirection operators and ultimately allows the same flexibility when working with RAM as C.
Two "interface" layers:
The base layer allows linking the language code to actual C/C++ functions, which means the language can call your custom functions written in a DLL.

The second layer is an "external" interface that allows the application to control I/O flow and even code-execution flow. I will explain the intended purpose for this later.

Two-pass compilation: No need to predeclare globals, functions, enums, structs, or unions, in most cases.
Preprocessing: Full preprocessing support, including #include and #pragma, and all preprocessing directives related to macros.





I am at a point in the language's creation where the foundation, which is already definite and concrete (the bytecode emission for all expressions and statements), is nearly done, and I can begin working on the upper levels of the language, which could still change should I get new ideas.

That is why I am writing now.

I am at the point where I have begun creating the basic standard API (StrCmp(), StrCat(), MemCpy(), Malloc(), Free(), etc.)
Firstly, are there any specific standard API functions of an odd sort that one would consider valuable to have in the language?
Note that if I miss any functions, those functions can be written in the language itself anyway, but I am aiming for ease of use and trying to create a full API.



I now need to explain my intention for interface layer 2.
My intent for this layer, when used with Memory Hacking Software, is to allow the user to declare variables and functions inside the target process.
For example:
Code:
extern DWORD g_dwHealth = { "gamex86.dll", 0x443DC }; // Declare g_dwHealth as a variable in the target process.
g_dwHealth++; // Increase the value in the target process.

This way you can manipulate the RAM of the target process as if it is local in your own code.
The read/write operations on g_dwHealth will be redirected via interface layer 2 to read/write the value in the target process, outside of Memory Hacking Software.
This is a new feature in programming and I hope to make the most of it, which is the reason I am posting now.
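As a rough illustration of the redirection described above, here is a minimal sketch in C. It is not the actual Memory Hacking Software implementation: the `ExternalInterface` struct, the `FakeRead`/`FakeWrite` callbacks, and `ExternIncrement32` are hypothetical names, and the target process is simulated with a local byte array so the example stays portable. On Windows the callbacks would presumably wrap ReadProcessMemory() and WriteProcessMemory() instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the target process's memory (assumption: the real tool
 * would talk to another process, not to a local array). */
static uint8_t g_fakeTarget[0x100];

/* Hypothetical layer-2 callbacks supplied by the host application. */
typedef struct {
    int (*read)(uint32_t addr, void *dst, size_t len);
    int (*write)(uint32_t addr, const void *src, size_t len);
} ExternalInterface;

/* Copies len bytes out of the simulated target; returns 0 if out of range. */
static int FakeRead(uint32_t addr, void *dst, size_t len) {
    if (addr > sizeof g_fakeTarget || len > sizeof g_fakeTarget - addr) return 0;
    memcpy(dst, g_fakeTarget + addr, len);
    return 1;
}

/* Copies len bytes into the simulated target; returns 0 if out of range. */
static int FakeWrite(uint32_t addr, const void *src, size_t len) {
    if (addr > sizeof g_fakeTarget || len > sizeof g_fakeTarget - addr) return 0;
    memcpy(g_fakeTarget + addr, src, len);
    return 1;
}

/* What a VM could do for "g_dwHealth++" on an extern DWORD:
 * read through the interface, modify locally, write back. */
static int ExternIncrement32(const ExternalInterface *ext, uint32_t addr) {
    uint32_t value;
    assert(ext != NULL);
    if (!ext->read(addr, &value, sizeof value)) return 0;
    value++;
    return ext->write(addr, &value, sizeof value);
}
```

The script never sees the callbacks; the compiler only has to mark the variable as external so the VM knows to route its loads and stores through the interface.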

The second part of layer 2 is declaring functions in the target process.
Code:
extern __stdcall DWORD RuleWorld( BOOL bMakeHumansIntoPets, DWORD dwNumberOfYourBaseThatAreBelongToUs ) = { "gamex86.dll", 0x463D3 };

This would redirect the code execution to the actual code in the target process, then return the result back to your local script.
Which means it becomes easy to call functions in the target process and work with them.
For example:
Code:
g_dwHealth = RuleWorld( TRUE, 32 ); // Call the function in the target process and assign the return to a value in the target process.

if ( RuleWorld( FALSE, g_dwHealth ) == 90 ) { ; } // Branch based off the return of the function in the target process.





As you can see, there is a lot of potential with this type of system, and I am hoping to maximize that.
I would like to know of any ideas for the API set, and if there are any ideas for extra features for the language such as this one.
Right now I am just at that point where the next things I code make the language final, so before I start on it I need to get all my ideas together.

If there are any ideas you may have for situations where the language would be able to help with any other types of hacking/debugging/etc., please provide your input.


Thank you.
L. Spiro

laola
April 26th, 2006, 03:01
Hi L. Spiro,

Great stuff! Can't wait to have a look at the final version. Here are my spontaneous 2 cents:

As it seems, you're (right now) just providing support for calling existing functions in the target process. How about a means of injecting code and executing it there? I know that this has already been implemented a zillion times but... It seems to be a good extension to what you already have.

And of course transparent memory dump/load is a must: functions to save a particular region of the target process' address space, or to load that region from a file or other source. Useful for implementing savegame functionality. Beyond the trivial cases, these functions should also be able to take care of guard pages and other nasty things.

L. Spiro
April 26th, 2006, 03:34
To keep the language as a stand-alone package, the API of the language is custom-built per application, and defines how the language interacts with that application (what functions the users can call in script, aside from the ones they write in the script itself).

I am making a standard API that ships with the language itself (should a company decide to use it in their applications, they would not need to bother creating the standard "universal" API), then sets of secondary APIs that can be added to the base API, which are specific to my application (Memory Hacking Software).

I plan to include API functions that allow code injection through Memory Hacking Software (using its current code injector, controlled through script).

I was planning to allow the user script access to ReadProcessMemory(), WriteProcessMemory(), and file streams through the API, so the type of dump you want could be constructed entirely from script by the user, but since you mention it I will also look into at least the dumping part and wrap it into a simple API call.
As for reloading into the target process, I will be supplying the base functions to do that so the user can script it together.
I will include custom API functions that allow easy calculations of module offsets and things.


Quote:
As it seems, you're (right now) just providing support for calling existing functions in the target process.

Globally, there are 3 types of functions.
Local: Functions written by the user in the script itself.
Internal: Functions defined by Memory Hacking Software that can be called from the script (these compose the entire API and can change if another application were to use my language as a script service).
External: Functions declared remotely in the target process. These functions are to be found by the user and declared in the script itself (supplying the module and offset in the module) as per my example above.

The scope of “external” functions is unlimited; you can call any function in any module in any process on your computer (though the process chosen is the one you have open in Memory Hacking Software) as long as you know the address, parameters, and calling convention.
This means you can inject code into the target process (either through my script API functions or by any other means) and then call that code from the script.


L. Spiro

Silver
April 26th, 2006, 07:21
That sounds nice, it's similar to something I wrote a while ago. Your version sounds like it has more features though, especially the call-a-function and declare-vars-inside-the-target feature. I'm curious to know how far you've got with this.

One question, you say it's compiled to bytecode like Java - do you mean the output of your compiler runs in a virtual machine you coded (that's how mine works), or that it's compiled to native win32 executable?

Quote:
the API of the language is custom-built per application, and defines how the language interacts with that application


Can you explain this a bit further? Why do you need to custom build your api per target, isn't the base functionality the same for every target and it's the scriptcoder that writes the target-specific code?

L. Spiro
April 26th, 2006, 08:35
Quote:
I'm curious to know how far you've got with this.

All statements and expressions compile and execute except do/while. There is no specific reason for this statement to be last; I just got happy starting on the API functions so I felt it could wait a few days.
Memory Hacking Software can call script functions and get the returns, and the script functions can call functions declared in Memory Hacking Software (or a user DLL loaded into Memory Hacking Software).
That’s the progress with the base language.
The “standard” API may already be totally done, assuming I didn’t forget any of the functions.


Quote:
One question, you say it's compiled to bytecode like Java - do you mean the output of your compiler runs in a virtual machine you coded (that's how mine works), or that it's compiled to native win32 executable?

It is compiled to a custom instruction set that I have written which is then interpreted via a virtual machine, which I have also written.
My instruction set is roughly compatible with ASM and mostly works the same way as compiled C, but I allowed myself a few shortcuts, such that I can have infinite registers (total registers needed for any given code segment is calculated at compile-time; usually it is only 7) and my registers are all 64-bit.




Quote:
Why do you need to custom build your api per target, isn't the base functionality the same for every target and it's the scriptcoder that writes the target-specific code?

Yes and no.
The API is usually how the script interacts with the host software.
A set of functions that perform actions related to the host application, for example, “GetOpenFileCount()” or something of that nature.
So the programmer of the host software needs to expose these functions to the script so that the user can call them from script.
But these functions are not necessarily specific to that application. Some are general, like stdlib. For the ease of the programmer writing the host application, I am wrapping the standard functions together into what I was calling the "base API".
StrCat(), StrLen(), MemCpy(), etc., are functions that are generally going to be used in multiple applications, so instead of having the writers of those applications expose those functions to the script, I am doing it.
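To make the idea of "exposing" functions concrete, here is a minimal sketch in C of a name-to-function table that a host could fill before running a script. It is only an illustration under assumptions: `RegisterApi`, `FindApi`, the `NativeFn` signature, and the hard-coded return value of `Api_GetOpenFileCount` are all hypothetical and not taken from the real library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical calling shape: every exposed function takes an array of
 * 64-bit arguments and returns a 64-bit result. */
typedef int64_t (*NativeFn)(const int64_t *args, size_t argc);

typedef struct {
    const char *name;
    NativeFn fn;
} ApiEntry;

#define MAX_API 32
static ApiEntry g_api[MAX_API];
static size_t g_apiCount;

/* Adds one function to the table; returns 0 when the table is full. */
static int RegisterApi(const char *name, NativeFn fn) {
    if (g_apiCount == MAX_API) return 0;
    g_api[g_apiCount].name = name;
    g_api[g_apiCount].fn = fn;
    g_apiCount++;
    return 1;
}

/* Looks a function up by the name the script used; NULL if not exposed. */
static NativeFn FindApi(const char *name) {
    for (size_t i = 0; i < g_apiCount; i++) {
        if (strcmp(g_api[i].name, name) == 0) return g_api[i].fn;
    }
    return NULL;
}

/* A general "base API" function that any host could reuse. */
static int64_t Api_StrLen(const int64_t *args, size_t argc) {
    assert(argc == 1);
    return (int64_t)strlen((const char *)(intptr_t)args[0]);
}

/* A host-specific function; the value 3 is a placeholder. */
static int64_t Api_GetOpenFileCount(const int64_t *args, size_t argc) {
    (void)args;
    (void)argc;
    return 3;
}
```

A host would register the base functions once, then add its own application-specific entries to the same table.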

From there, if a company uses my language for script support in their own applications they can just build upon my base API and quickly add their own application-specific functions. Or they could remove my base API and make their own, or do whatever.

My base API is roughly equal to "#include <stdlib.h>".




The overall progress is pretty far.
The only things remaining in the language itself are that last statement, lots of testing, interface layer 2, the base API, and all preprocessing.
But that isn't enough to release Memory Hacking Software yet.
I need to write the environment and help files, and fully integrate it into Memory Hacking Software.
This means I have to write functions that allow the user to control all parts of Memory Hacking Software from the script.
The script should be able to open/close the Debugger, add/remove breakpoints, handle breakpoint hits, start/stop searches, scan search results, load/control/close the Disassembler and Hex Editor, inject code, load DLLs, etc.

It seems to be a lot of work but really a lot of it is easy and/or copy/paste work.
And I won't wait until it is ALL done before I release the first version.
Good to get hands-on feedback before everything is complete (although enough of it needs to be done so that users don’t think it is crappy).




Just to tease your fancy while giving an example of the capabilities of the language, you could expose a bunch of the DirectX functions to the script and code an entire 2-D game in the script from there. I believe the framerate would be decent and playable, and there are tons of optimizations I am going to work into the compiler once everything else is up-and-running.





L. Spiro

Silver
April 26th, 2006, 10:42
Quote:
So the programmer of the host software needs to expose these functions to the script so that the user can call them from script


Ah, that makes perfect sense now. I think I originally misunderstood part of the purpose of your app.

Quote:
It seems to be a lot of work but really a lot of it is easy and/or copy/paste work.


Yes, that's exactly what I found. Once you have the base structure in, almost everything else is copy and paste with one or two minor changes for keywords etc.

Two things for you. First, according to "popular theory", the kind of languages we've both done are about 15 times slower than native executables; in other words, for every bytecode instruction your VM executes 15+ native instructions. I found this to be important in silly places. For example, when dealing with arrays your script will have:

Code:
// Assumes myarray is declared string myarray[10];
string mystring = myarray[0];


This will translate to pushing 0, then myarray, then popping the result to mystring. If you add a specific PUSHZERO instruction to your compiler and optimize all push 0, push <var> to PUSHZERO <var>, you're saving 1 virtual machine instruction per variable access, which in turn saves 15+ native win32 instructions. When you think how many vars a single script would use you're quickly saving thousands of instructions. I hope that makes sense.
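The push-0 fusion described above can be sketched as a small peephole pass in C. This is an illustration only: the `Instr` layout and the opcode names are hypothetical, not taken from either tool, and a real compiler would also have to fix up jump targets after shortening the code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stack-machine opcodes. */
typedef enum {
    OP_PUSH_CONST,   /* operand: constant value */
    OP_PUSH_VAR,     /* operand: variable slot  */
    OP_PUSHZERO_VAR, /* fused "push 0; push var" */
    OP_POP_VAR       /* operand: variable slot  */
} Op;

typedef struct {
    Op op;
    int operand;
} Instr;

/* Rewrites code in place, replacing every "PUSH_CONST 0; PUSH_VAR v"
 * pair with a single "PUSHZERO_VAR v". Returns the new length. */
static size_t FusePushZero(Instr *code, size_t n) {
    size_t out = 0;
    assert(code != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n &&
            code[i].op == OP_PUSH_CONST && code[i].operand == 0 &&
            code[i + 1].op == OP_PUSH_VAR) {
            code[out].op = OP_PUSHZERO_VAR;
            code[out].operand = code[i + 1].operand;
            out++;
            i++; /* skip the PUSH_VAR that was folded in */
        } else {
            code[out++] = code[i];
        }
    }
    return out;
}
```

Each fused pair removes one trip through the interpreter's dispatch loop, which is where the per-instruction overhead comes from.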

The Dragon Book is the bible for compiler creation:

(http://www.amazon.com/gp/product/0201100886/002-3398322-0910426?v=glance&n=283155)

Second thing, you should add implicit type conversion to your language (if you haven't already). It's extremely handy to have typed variables that can be implicitly converted, e.g.:

int mynum = 5;
string mystring = "My number is " + mynum;

I find all this very interesting personally so if you do have further thoughts and ideas I'd love to hear them. I might even steal them for my app.

I'll stop polluting your thread now...

L. Spiro
April 26th, 2006, 11:52
Quote:
If you add a specific PUSHZERO instruction to your compiler and optimize all push 0, push <var> to PUSHZERO <var>, you're saving 1 virtual machine instruction per variable access, which in turn saves 15+ native win32 instructions. When you think how many vars a single script would use you're quickly saving thousands of instructions. I hope that makes sense.

This deals with the optimizations I mentioned earlier (though my language already automatically employs optimizations with constants, so if an array index is hard-coded to any number it will just use one instruction and access that index immediately).
It’s exactly along these lines; a large instruction set where each instruction carries out specific tasks or groups several common tasks together so that fewer interpretations are present.

Right now I have over 250 instructions, where each instruction has a single very specific task with specific left and right operands. This avoids further parsing during decoding of my instructions, since each instruction already knows the next byte is a register index or constant value or whatever.

From here, the next optimizations I plan to add are with the ++ and -- operators. Because they are so common, it is highly beneficial to have a single instruction perform these operations instead of 3 (currently I use the C/C++ method of mov, add, mov).
I have more planned for other areas.
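A minimal sketch in C of both points, assuming a made-up five-opcode instruction set (these are not the real opcodes): each opcode implies its own operand layout, so the decoder never has to inspect operand types, and a fused `OP_INC_REG` replaces the mov/add/mov sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes; the operand bytes that follow are fixed per opcode. */
enum {
    OP_MOV_REG_IMM8, /* reg, imm8 */
    OP_MOV_REG_REG,  /* dst, src  */
    OP_ADD_REG_IMM8, /* reg, imm8 */
    OP_INC_REG,      /* reg       */
    OP_HALT
};

/* Runs until OP_HALT (or an unknown opcode) and returns how many
 * instructions were dispatched, including the final one. */
static size_t Run(const uint8_t *code, int64_t *regs) {
    size_t ip = 0, dispatched = 0;
    assert(code != NULL && regs != NULL);
    for (;;) {
        uint8_t op = code[ip++];
        dispatched++;
        switch (op) {
        case OP_MOV_REG_IMM8: {
            uint8_t r = code[ip++];
            regs[r] = code[ip++];
            break;
        }
        case OP_MOV_REG_REG: {
            uint8_t dst = code[ip++];
            uint8_t src = code[ip++];
            regs[dst] = regs[src];
            break;
        }
        case OP_ADD_REG_IMM8: {
            uint8_t r = code[ip++];
            regs[r] += code[ip++];
            break;
        }
        case OP_INC_REG:
            regs[code[ip++]]++;
            break;
        default: /* OP_HALT or anything unrecognized stops the VM */
            return dispatched;
        }
    }
}
```

With this layout, incrementing a register through a temporary costs three dispatches, while the fused form costs one.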



Quote:
The Dragon Book is the bible for compiler creation:

Actually I have already purchased it. It helped in a few areas.



Quote:
Second thing, you should add implicit type conversion to your language

90% of all conversions are already implicit, though I debated doing this because it may open the user to a few more careless mistakes. That’s where the 10% went, where I don’t implicitly convert up and down pointers.
And I do not have a “string” type in my language; it works the same way as C/C++, so the example you gave won’t be available unfortunately.
My reasoning is that my language is intended to work intimately with real C/C++ code, and I wish to ensure maximum direct compatibility.


Another feature is that, using my base API, no memory/resource leaks may occur.
The system will keep track of all resources allocated and freed and ensure that the script can not fail to release handles, close files, or deallocate memory.
This is specific to my API though; I use wrappers for malloc(), realloc(), and free() to do this, though another application may decide to eliminate this feature.
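One way such wrappers could work is sketched below in C. This is an assumption about the general technique, not the actual code: `ScriptMalloc`, `ScriptFree`, and `ScriptReleaseAll` are hypothetical names, and only memory is tracked here (handles and files would need similar lists).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One node per live allocation made by the script. */
typedef struct Node {
    void *ptr;
    struct Node *next;
} Node;

static Node *g_live; /* head of the list of outstanding allocations */

/* malloc() wrapper that records the allocation. */
static void *ScriptMalloc(size_t size) {
    Node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->ptr = malloc(size);
    if (n->ptr == NULL) {
        free(n);
        return NULL;
    }
    n->next = g_live;
    g_live = n;
    return n->ptr;
}

/* free() wrapper: releases p only if the script really owns it.
 * Returns 1 on success, 0 for unknown pointers. */
static int ScriptFree(void *p) {
    for (Node **pp = &g_live; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->ptr == p) {
            Node *dead = *pp;
            *pp = dead->next;
            free(dead->ptr);
            free(dead);
            return 1;
        }
    }
    return 0;
}

/* Called by the host when the script ends: frees whatever the script
 * forgot and returns how many allocations had leaked. */
static size_t ScriptReleaseAll(void) {
    size_t leaked = 0;
    while (g_live != NULL) {
        Node *n = g_live;
        g_live = n->next;
        free(n->ptr);
        free(n);
        leaked++;
    }
    assert(g_live == NULL);
    return leaked;
}
```

The linear search in `ScriptFree` keeps the sketch short; a hash table or a header stored in front of each block would scale better.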

And I borrowed the “synchronized” keyword from Java for threading, so creating critical sections is quite simple, though prone to the same types of errors as with any other threading system.



Quote:
I'll stop polluting your thread now...

I see this as relevant to my purpose; speed is as relevant to the usefulness of the language as anything else.
I would consider anything valid in this thread as long as it relates in some way to improving the language itself, and/or how it is used in Memory Hacking Software.



I am very interested in seeing what any “visionaries” would have to say about it.
That is, people who would think about what types of doors are opened, and what types of doors could be opened from there.
I am really trying to cover all bases with this, and right now I can’t see any more bases that can’t be covered.

Memory Hacking Software will also allow you to load custom DLLs and expose your custom functions to the script, which means the script can then call your own real C/C++ functions, at which point the possibilities should become truly unlimited.


But if anyone can see a way to add more possibilities directly into the language or the API, all input is welcome.


L. Spiro

L. Spiro
June 3rd, 2006, 13:54
It’s almost there.

I will release it here when it is done, but before that, what do you think?
hXXp://www.memoryhacking.com/
The 12:45 AM 6/4/2006 update explains more about the language, and here is a brief explanation to go with the picture:
hXXp://www.memoryhacking.com/Minesweeper.png

Here I have used the “extern” feature of my language to easily rip into Minesweeper.
You can follow the code to see how I printed the board and used it to beat Minesweeper in 5 minutes.

Of course, Minesweeper can only offer a basic example, but it should suffice.

A more detailed explanation is in the first link I posted.
Again, I will release this tool here when it is ready, but in the meantime, any feedback or comments are welcome.
I am also open to any questions (“what is the point?”, “what else could it do?”, etc.)


L. Spiro

0xf001
June 3rd, 2006, 18:43
L. Spiro,

I'll check this out ... sounds really interesting.
Once I also had an idea for an automated "cracking" tool, but never implemented anything. Your ideas are somewhat similar but more general, which would make this the extended deluxe version of my ideas, haha. I like it.

cheers,

0xf001