OHPen
November 5th, 2012, 10:08
Hi,
I'm currently working on a very large binary, roughly 40 MB. Because the binary is updated frequently, I have to write a plugin that automatically identifies all the functions I have already reverse engineered (only the unchanged ones, of course). I don't want to re-apply my comments and changes by hand after every update. I know there are free binary diffing tools out there, but I need a custom tool. My approach would be to retrieve all relevant data of a given function and hash all the parts that are static (excluding, for example, opcode bytes that have relocations applied to them). Afterwards I would store the signature plus the associated information in a database, so that the pattern can be used for searching the next time. If a particular signature is no longer found (let's say after the 3rd update of the binary), I discard it, because I can assume that the code was either removed or changed so heavily that it has to be reverse engineered manually again.
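To make the idea concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under assumptions: the names `function_signature`, `remember`, `match_build` and the `sigs` table are made up, the relocation ranges are assumed to come from whatever the disassembler or the relocation table provides, and I interpret "after the 3rd update" as three consecutive builds in which the signature is missing.

```python
import hashlib
import sqlite3

MAX_MISSES = 3  # assumption: drop a signature after 3 consecutive builds without a match


def function_signature(code: bytes, reloc_ranges=()) -> str:
    """Hash a function's bytes with all relocated bytes zeroed out.

    reloc_ranges is an iterable of (offset, length) pairs relative to the
    start of the function (hypothetical input, e.g. from the relocation table).
    """
    masked = bytearray(code)
    for start, length in reloc_ranges:
        for i in range(max(start, 0), min(start + length, len(masked))):
            masked[i] = 0
    h = hashlib.sha256()
    h.update(len(code).to_bytes(4, "little"))  # function size is part of the signature
    h.update(bytes(masked))
    return h.hexdigest()


def open_db(path=":memory:"):
    """Open (or create) the signature database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS sigs ("
        "sig TEXT PRIMARY KEY, name TEXT, comment TEXT, "
        "misses INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def remember(db, sig, name, comment=""):
    """Store or refresh the annotations for an analysed function."""
    db.execute(
        "INSERT OR REPLACE INTO sigs (sig, name, comment, misses) VALUES (?, ?, ?, 0)",
        (sig, name, comment),
    )
    db.commit()


def match_build(db, found_sigs):
    """Match stored signatures against the signatures found in a new build.

    Returns {sig: (name, comment)} for every hit. Signatures that are not
    found get their miss counter increased and are deleted at MAX_MISSES.
    """
    matched = {}
    rows = db.execute("SELECT sig, name, comment FROM sigs").fetchall()
    for sig, name, comment in rows:
        if sig in found_sigs:
            matched[sig] = (name, comment)
            db.execute("UPDATE sigs SET misses = 0 WHERE sig = ?", (sig,))
        else:
            db.execute("UPDATE sigs SET misses = misses + 1 WHERE sig = ?", (sig,))
    db.execute("DELETE FROM sigs WHERE misses >= ?", (MAX_MISSES,))
    db.commit()
    return matched
```

For each new build, the plugin would compute `function_signature` for every function, pass the resulting set to `match_build`, and re-apply the returned names and comments. An exact hash like this only catches byte-identical functions (apart from the masked bytes), so anything the compiler reorders or re-registers would count as changed.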
I've never done this before, but I need it now because of the size of the binary and how often it changes.
What do you think?
What is the most efficient way to approach this? I have no problem investing a few weeks in this, so even more complex ideas are welcome. Looking forward to your replies!
Thanks in advance!
Regards,
OHPen