How to extract java classes from executable.


Magister
August 26th, 2009, 14:48
Hi everybody.

I want to extract the Java classes from an executable.
I disassembled the executable and found that it was built with the exe4j compiler.
I have confirmed that exe4j does not compile Java to native code, so there should be at least some way to get the Java classes.
I failed to find the classes in the executable's resources using a resource extractor (I used Resource Hacker), as I had read suggested elsewhere.
I know that, given the class files, I can get back to source code using JAD or a similar tool, which is my objective.

How, in your opinion, can I extract the classes from the exe?
Thank you!

OHPen
August 27th, 2009, 02:38
@magister: What you want to do is pretty easy. Just write a small program that opens the file in binary read mode. Search for the first occurrence of the signature 0xCAFEBABE. Note the offset, then do the same for every other 0xCAFEBABE signature in your native executable. Reset the file pointer and jump to the first offset you logged, then read the file from that position to the end into a byte array. If you code that in Java, you can easily pass the resulting byte array to the ClassLoader method "defineClass", which will give you a Class object for that class. Now you can use any Java bytecode instrumentation library to serialize the class back to disk. Repeat that process for every 0xCAFEBABE signature you found and you are done.

have fun,
OHPen

PS: There are also other solutions, but imo this is the fastest and easiest way to extract classes from a binary blob.
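
The scanning step OHPen describes could be sketched roughly as below. This is only an illustration: the class name MagicScanner and the method findMagicOffsets are made up for this example, and it assumes the executable is small enough to read into memory in one go. Any 4-byte match is reported, so some offsets may be coincidental byte patterns rather than real class files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists every offset where the bytes CA FE BA BE occur.
final class MagicScanner {

    static List<Integer> findMagicOffsets(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i + 4 <= data.length; i++) {
            // Bytes are signed in Java, so mask before comparing.
            if ((data[i] & 0xFF) == 0xCA
                    && (data[i + 1] & 0xFF) == 0xFE
                    && (data[i + 2] & 0xFF) == 0xBA
                    && (data[i + 3] & 0xFF) == 0xBE) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    // Usage: java MagicScanner path/to/program.exe
    public static void main(String[] args) throws IOException {
        byte[] exe = Files.readAllBytes(Paths.get(args[0]));
        for (int offset : findMagicOffsets(exe)) {
            System.out.printf("candidate class at 0x%08X%n", offset);
        }
    }
}
```

If the classes are stored encrypted, as turns out to be the case later in this thread, the scan will find few or no hits.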

Magister
August 27th, 2009, 03:35
Does 0xCAFEBABE mark both the beginning and the end of the class, or just the beginning? If the latter, and I read everything from the signature through to the end of the file, will the class loader know where the class ends even if I give it more data than needed (like a picture with a specified size, where additional data is ignored)?

I'm going to tinker with that.
Thank you very much, very helpful!

OHPen
August 27th, 2009, 04:43
0xCAFEBABE is the magic number each class file starts with. It's NOT found at the end of a class file. There is no simple way to check how long a class file is, so feeding a byte array that contains a full class plus, let's call it, garbage should not be a problem for ClassLoader.defineClass(), because the function calculates the size of the class file from what is specified in the class file itself.

Maybe you are lucky and all the classes in your binary are simply concatenated. Then it is certainly easier to extract them all. Just read from one 0xCAFEBABE magic to the next and write that to a file, then parse the "this_class" entry in the class file structure to get the class file name.
I don't think it will be that easy, but it is worth a try. Usually companies develop their own format for storing class files in their protected binaries.

If you have no problem with writing your own Java class file parser, which is btw not difficult but time consuming, then that will work in every case.

Regards,
OHPen
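
A minimal sketch of the kind of parser OHPen mentions, written under assumptions: the names ClassFileMeasure, Info and measure are invented for this example, and the constant pool tags and field layout follow the published class file format. It walks the structure starting at a 0xCAFEBABE offset and reports the exact length and the "this_class" name, which is useful because the JVM specification's format checks include "no extra bytes at the end", so a loader may reject a buffer with trailing data.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: measures one class file embedded in a larger buffer.
final class ClassFileMeasure {

    static final class Info {
        final String name;   // internal name, e.g. "com/example/Foo"
        final int length;    // total size of the class file in bytes
        Info(String name, int length) { this.name = name; this.length = length; }
    }

    // Throws a RuntimeException if the bytes at 'start' are not a plausible class file.
    static Info measure(byte[] data, int start) {
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian by default, like class files
        buf.position(start);
        if (buf.getInt() != 0xCAFEBABE) throw new IllegalArgumentException("no magic at " + start);
        skip(buf, 4); // minor_version, major_version
        int cpCount = buf.getShort() & 0xFFFF;
        String[] utf8 = new String[cpCount];
        int[] classNameIndex = new int[cpCount];
        for (int i = 1; i < cpCount; i++) {
            int tag = buf.get() & 0xFF;
            switch (tag) {
                case 1: { // Utf8: u2 length + bytes (modified UTF-8, fine for plain names)
                    byte[] raw = new byte[buf.getShort() & 0xFFFF];
                    buf.get(raw);
                    utf8[i] = new String(raw, StandardCharsets.UTF_8);
                    break;
                }
                case 7: classNameIndex[i] = buf.getShort() & 0xFFFF; break; // Class
                case 8: case 16: case 19: case 20: skip(buf, 2); break;
                case 15: skip(buf, 3); break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    skip(buf, 4); break;
                case 5: case 6: skip(buf, 8); i++; break; // Long/Double take two slots
                default: throw new IllegalArgumentException("unknown constant tag " + tag);
            }
        }
        skip(buf, 2); // access_flags
        int thisClass = buf.getShort() & 0xFFFF;
        skip(buf, 2); // super_class
        skip(buf, 2 * (buf.getShort() & 0xFFFF)); // interfaces
        skipMembers(buf);    // fields
        skipMembers(buf);    // methods
        skipAttributes(buf); // class attributes
        return new Info(utf8[classNameIndex[thisClass]], buf.position() - start);
    }

    private static void skipMembers(ByteBuffer buf) {
        int count = buf.getShort() & 0xFFFF;
        for (int i = 0; i < count; i++) {
            skip(buf, 6); // access_flags, name_index, descriptor_index
            skipAttributes(buf);
        }
    }

    private static void skipAttributes(ByteBuffer buf) {
        int count = buf.getShort() & 0xFFFF;
        for (int i = 0; i < count; i++) {
            skip(buf, 2);            // attribute_name_index
            skip(buf, buf.getInt()); // attribute_length bytes of payload
        }
    }

    private static void skip(ByteBuffer buf, int n) {
        if (n < 0) throw new IllegalArgumentException("negative length");
        buf.position(buf.position() + n);
    }
}
```

With the length known, java.util.Arrays.copyOfRange(data, start, start + info.length) gives exactly one class, which can be written to a file named after info.name. A coincidental 0xCAFEBABE match will usually fail with an exception, so a caller would catch RuntimeException and move on to the next offset.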

dion
August 28th, 2009, 05:59
I worked on a similar exe4j executable some time ago. It turned out to be really easy to figure out. It may differ for every version, but in mine the real content was placed at the end of the file incrementally, and it used a rather simple encryption method to obfuscate it.

Magister
August 28th, 2009, 09:22
I got most of the classes by modifying the defineClass method of ClassLoader in rt.jar, but I am still missing many classes, because a class has to be loaded before it can be dumped.

Indeed, I found the loader classes among the others, and they use some kind of encryption. I think I can now use those loader classes to decode the encrypted classes from the exe. That way I can be sure to get every class in the exe, instead of just the subset of classes that were loaded during one application run.

Just one more question. How can I identify where an encrypted class is, or recognize that a specific part of the exe is an encrypted class when I find it? I suppose the signature won't work if the class is encrypted, and I also suppose that in order to decode a class I need to know where it starts *and* where it ends.

Thank you.

OHPen
August 28th, 2009, 09:47
Hi,

in that case you have to reverse the way the classes are stored in your binary. Or you take a look at the routine that feeds the class loader for the encrypted classes with input. Both will show you how to identify a class.

And yes, you are right. You have to know where an encrypted class starts and what its size is in order to decrypt it successfully.
If you just decrypt a buffer that contains your encrypted class somewhere inside it, the resulting buffer will be garbage.

Regards,
OHPen

PS: The approach with modifying the ClassLoader class of rt.jar is a good way to dump classfiles at runtime. You can do the same with your EncryptionClassLoader, should work too.
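
As an alternative to patching ClassLoader inside rt.jar, the same runtime dump could probably be done with the standard java.lang.instrument API. This is a different technique from the one used in this thread, and the sketch below is only an assumption about how one might set it up: the class name ClassDumper and the helper install are invented for this example. A transformer sees the bytes handed to defineClass, so classes decrypted by a custom loader should show up in plain form, but, as with the rt.jar patch, only classes that actually get loaded are dumped.

```java
import java.io.IOException;
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.ProtectionDomain;

// Hypothetical transformer: writes every class it is shown to <outDir>/<name>.class.
final class ClassDumper implements ClassFileTransformer {
    private final Path outDir;

    ClassDumper(Path outDir) {
        this.outDir = outDir.toAbsolutePath();
    }

    // Meant to be called from a public agent class's premain(String, Instrumentation),
    // which is named in the agent jar's Premain-Class manifest entry.
    static void install(Instrumentation inst, Path outDir) {
        inst.addTransformer(new ClassDumper(outDir));
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (className != null) { // some generated classes have no name
            try {
                // className uses '/' separators, e.g. "com/example/Foo"
                Path target = outDir.resolve(className + ".class");
                Files.createDirectories(target.getParent());
                Files.write(target, classfileBuffer);
            } catch (IOException e) {
                System.err.println("could not dump " + className + ": " + e);
            }
        }
        return null; // null means "leave the class unchanged"
    }
}
```

The application would then be started with the -javaagent option pointing at the agent jar. Whether an exe4j launcher lets you pass JVM options through is something to check for the specific executable.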