little state of art in decompilation


Shub-nigurrath
January 11th, 2005, 04:45
A while back I found an interesting paper, "Reverse Engineering and the Computing Profession" by C. Cifuentes, Computer, Dec 2001, pp. 166-168.

Inside there was an interesting sidebar about the state of the art in decompilation and RCE in general, with some interesting links, some of which were new to me too.

It came into my hands again recently in electronic format, and although the paper dates back to 2001, the information inside is still valid. I decided to share it here for anyone interested, in the hope that it's useful to someone.

Quote:

State of the Art
American and European researchers have developed several good high-level tools for reverse engineering. Tools such as Rigi (http://www.rigi.csc.uvic.ca), PBS (http://swag.uwaterloo.ca/pbs/), and GUPRO (http://www.gupro.de/) aid in program understanding and software architecture recovery.
Other tools, such as SHriMP (http://www.thechiselgroup.org/shrimp), significantly contribute to understanding a large piece of software through different visualization techniques.

Most tools focus on one aspect of reverse engineering. They may specialize in parsing code well, producing different types of graph views, or producing architecture diagrams in UML. Unlike program transformation tools such as compilers, which build an intermediate representation in memory and apply transformations to that representation internally, reverse engineering tools tend to cooperate with each other to support different parts of the reverse engineering process.

The Graph eXchange Language (http://www.gupro.de/GXL/) has simplified reverse-engineering tool interoperability. GXL, based on XML, resulted from an international collaboration between researchers in academia and industry. An extensible language, it supports any graph-based data format. For example, you can describe a control flow graph or an abstract syntax tree in GXL.
A key difficulty in using reverse engineering tools arises from their need to support a variety of languages or be capable of extension to support another language. Although building them has often proven to be a daunting task, tools have been successfully designed to convert Cobol code to C, and to translate C code to C++ or Java code.
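
To make the GXL remark in the quoted text a bit more concrete, here is a minimal sketch of writing a small control flow graph out as GXL with Python's standard library. The helper name cfg_to_gxl, the block labels and the "label" attribute name are my own choices for illustration. The element names (gxl, graph, node, edge, attr, string) follow GXL 1.0 as I understand it, so check the result against the GXL DTD before feeding it to a real tool.

```python
import xml.etree.ElementTree as ET


def cfg_to_gxl(graph_id, blocks, edges):
    """Serialize a control flow graph as a GXL document string.

    blocks: dict mapping basic-block id -> human-readable label
    edges:  list of (source_id, target_id) pairs
    (hypothetical helper, not part of any GXL toolkit)
    """
    root = ET.Element("gxl")
    graph = ET.SubElement(root, "graph", {"id": graph_id, "edgemode": "directed"})

    # One <node> per basic block, with its label stored as a typed attribute.
    for block_id, label in blocks.items():
        node = ET.SubElement(graph, "node", {"id": block_id})
        attr = ET.SubElement(node, "attr", {"name": "label"})
        ET.SubElement(attr, "string").text = label

    # One <edge> per control transfer between blocks.
    for source, target in edges:
        ET.SubElement(graph, "edge", {"from": source, "to": target})

    return ET.tostring(root, encoding="unicode")


# A tiny if/else diamond: entry branches to then/else, both rejoin at exit.
blocks = {
    "entry": "if (x > 0)",
    "then": "y = 1",
    "else": "y = 2",
    "exit": "return y",
}
edges = [("entry", "then"), ("entry", "else"), ("then", "exit"), ("else", "exit")]

gxl_text = cfg_to_gxl("cfg_example", blocks, edges)
print(gxl_text)
```

An abstract syntax tree could be written the same way, with one node per tree node and one edge per parent/child link.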

Some low-level reverse engineering tools have also been successful, including interactive commercial disassemblers such as IDA Pro (http://www.datarescue.com/idabase/ida.htm) and Sourcer (www.v-com.com/product/Sourcer_Home.html), which provide good quality assembly code for a variety of machines. During the Y2K crisis, a few companies provided decompilation services for Cobol binaries because many large organizations have vast legacy applications written in that language. Java decompilers have also been written—more easily than some other decompilers because the Java program’s binary format is not machine code but rather an intermediate representation called Java bytecodes.
Otherwise, most decompilation techniques are supported manually: The engineer decompiles assembly code mentally and annotates the representation with its high-level equivalent.
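
As a rough illustration of why a stack-based intermediate format like Java bytecode is easier to decompile than machine code, here is a toy sketch that rebuilds a high-level statement by simulating the operand stack symbolically. It is not a real decompiler. The mnemonics are borrowed from the JVM, but the tuple encoding, the use of variable names instead of local slot numbers, and the function name decompile_expression are simplifications I made up for the example.

```python
def decompile_expression(instructions):
    """Turn a straight-line list of stack instructions into source-like statements.

    Each instruction is a tuple: (mnemonic, optional operand).
    Only a handful of integer operations are handled (toy subset).
    """
    binary_ops = {"iadd": "+", "isub": "-", "imul": "*"}
    stack = []        # holds expression strings instead of runtime values
    statements = []

    for op, *args in instructions:
        if op == "iload":          # push a local variable (named here, a slot in real bytecode)
            stack.append(args[0])
        elif op == "iconst":       # push an integer constant
            stack.append(str(args[0]))
        elif op in binary_ops:     # pop two operands, push the combined expression
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {binary_ops[op]} {right})")
        elif op == "istore":       # pop the expression and emit an assignment
            statements.append(f"{args[0]} = {stack.pop()};")
        else:
            raise ValueError(f"unsupported instruction: {op}")

    return statements


program = [
    ("iload", "a"),
    ("iload", "b"),
    ("iadd",),
    ("iconst", 2),
    ("imul",),
    ("istore", "c"),
]

print(decompile_expression(program))  # ['c = ((a + b) * 2);']
```

With native machine code the same job first needs separating code from data, recovering types and undoing register allocation, which is the part the engineer usually ends up doing by hand.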