evaluator
May 28th, 2009, 13:17
Here I will rebuild, step by step, the face of VMProtect ~

Code:
VM_handlers (~71)
(SA4 = StackAdd 4, SS4 = StackSub 4; the number after SA/SS is the byte count the handler adds to or subtracts from the VM stack)
AddByteByte_SS2
AddWordWord_SS2
AddDwordDword
Div168_SS2
Div3216_SS2
DIV6432
ExitVM
IDIV168_SS2
IDIV3216_SS2
IDIV6432
IMUL88_SS2
IMUL1616_SS4
IMUL3232_SS4
MUL88_SS2
MUL1616_SS4
MUL3232_SS4
NotNotAndByte_SS2
NotNotAndWord_SS2
NotNotAndDword
PopBP
PopEBP
PopfD_SA4
LoadVmIP_SA4
PopMemByte_SA6
PopMemByteSS_SA6
PopMemByteES_SA6
PopMemWord_SA6
PopMemWordSS_SA6
PopMemWordES_SA6
PopMemDword_SA8
PopMemDwordSS_SA8
PopMemDwordES_SA8
PopByteToVMRegsImmID_SA2
PopWordToVMRegsImmID_SA2
PopDwordToVMRegsOpcID_SA4
^^^^
PushByteFromVMRegsImmID_SS2
PushWordFromVMRegsImmID_SS2
PushDwordFromVMRegsOpcID_SS4
^^^^ these Pop/Push handlers access the byte/word/dword parts of the VM-Registers (AL, AX, EAX)
PushBP_SS2
PushEBP_SS4
PushImmUByte_SS2
PushImmSByte_SS4
PushImmWord_SS2
PushImmSWord_SS4
PushImmDword_SS4
PushMemByte_SA2
PushMemByteSS_SA2
PushMemByteES_SA2
PushMemWord_SA2
PushMemWordSS_SA2
PushMemWordES_SA2
PushMemDword
PushMemDwordSS
PushMemDwordES
RclByte_SS2
RclWord_SS2
RclDword_SS2
RcrByte_SS2
RcrWord_SS2
RcrDword_SS2
SHLD_SA2
SHRD_SA2
ShlByte_SS2
ShlDword_SS2
ShlWord_SS2
ShrByte_SS2
ShrDword_SS2
ShrWord_SS2
tool-handlers
PushRDTSC_SS8
PushCPUID_SS12 (value)
CRCsum_SA4 (pmem, size)
**
All logical & arithmetic handlers that must take care of EFlags contain code to store EFlags:
pushfd
pop dword [ebp+0]
After such a handler the VM always calls PopDwordToVMRegsOpcID_SA4
to store the EFlags into the VM-Registers (intermediate or main),
so we can state that these are VM-opcode pairs.
*******
So VMProtect hides a CPU instruction by splitting the single instruction into many VM opcodes.
But these must fully reproduce the CPU instruction, including the correct result in EFlags.
For logical instructions the author built one main VM instruction, "NotNotAnd",
and it is a logical NOR gate(!)
Its asm code looks like this:
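mov eax, [ebp+0]     ; first operand
mov edx, [ebp+4]     ; second operand
not eax
not edx
and eax, edx         ; eax = (not op1) and (not op2)
mov [ebp+4], eax     ; result overwrites the second operand
pushfd
pop dword [ebp+0]    ; EFlags overwrite the first operand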
NotNotAnd (var1, var2) = And (Not var1) (Not var2)
The other main logical instructions (Xor, Or, And, Not) are emulated with this VM code.
It seems this sequence produces valid EFlags for the emulated logical instructions,
so no further work needs to be done on EFlags.
VM_NOT (A) = NotNotAnd (A, A)
PushEBP_SS4 + PushMemDwordSS = push dword[esp]
this pair is usually used in VM_NOT to duplicate the value and avoid calculating it twice
VM_AND is fun, because the variables must be Not-ed before they are passed to NotNotAnd, which Not-s them back.
VM_AND (A, B) = NotNotAnd {VM_NOT (A), VM_NOT (B)} = NotNotAnd {NotNotAnd (A, A) , NotNotAnd (B, B)}
VM_TEST = VM_AND ; the result value is stored in an intermediate VM-register and discarded
VM_OR (A, B) = VM_NOT [NotNotAnd (A, B)] = NotNotAnd {NotNotAnd (A, B) , <same value pushed again>}
VM_XOR (A, B) = NotNotAnd {NotNotAnd (A, B)} {VM_AND (A, B)}
=NotNotAnd {NotNotAnd (A, B)} {NotNotAnd [NotNotAnd (A, A) , NotNotAnd (B, B)]}
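The identities above can be checked quickly on 32-bit values. This is only a small sketch: the Python function names are mine, and EFlags are not modelled at all.

```python
MASK32 = 0xFFFFFFFF

def not_not_and(a, b):
    # NotNotAnd(var1, var2) = And(Not var1)(Not var2), i.e. a 32-bit NOR
    return ~a & ~b & MASK32

def vm_not(a):
    return not_not_and(a, a)

def vm_and(a, b):
    # operands are Not-ed first, NotNotAnd then Not-s them back
    return not_not_and(vm_not(a), vm_not(b))

def vm_or(a, b):
    return vm_not(not_not_and(a, b))

def vm_xor(a, b):
    return not_not_and(not_not_and(a, b), vm_and(a, b))

a, b = 0x12345678, 0x0F0F00FF
print(hex(vm_xor(a, b)), hex(a ^ b))
```

Both printed values should be the same, and the same holds for vm_and, vm_or and vm_not against Python's own &, | and ~.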
Rol, Ror and Sar are emulated via the Shld & Shrd handlers;
first the value is sign-extended to EAX:EDX, then Shld-ed
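A rough model of how double shifts can stand in for rotates and for Sar. The way the second operand is built here (a copy of the value for Rol/Ror, a sign fill for Sar) is my assumption of the usual trick; the notes only say the value is extended and then double-shifted.

```python
MASK32 = 0xFFFFFFFF

def shld(dest, src, count):
    # x86 SHLD: shift dest left, vacated low bits come from the top of src
    count &= 31
    if count == 0:
        return dest
    return ((dest << count) | (src >> (32 - count))) & MASK32

def shrd(dest, src, count):
    # x86 SHRD: shift dest right, vacated high bits come from the bottom of src
    count &= 31
    if count == 0:
        return dest
    return ((dest >> count) | (src << (32 - count))) & MASK32

def rol(x, n):
    return shld(x, x, n)           # assumption: value paired with itself

def ror(x, n):
    return shrd(x, x, n)           # assumption: value paired with itself

def sar(x, n):
    sign_fill = MASK32 if x & 0x80000000 else 0   # upper half after sign extension
    return shrd(x, sign_fill, n)

print(hex(rol(0x80000001, 1)), hex(sar(0x80000000, 4)))
```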
VM_RCL and VM_RCR are handled by the RclDword_SS2 & RcrDword_SS2 handlers,
for which the Carry flag must first be extracted from the EFlags kept in the VM-Registers;
then the instruction "SHR CH, 1" in the handler code loads the extracted Carry flag and the RCL/RCR is done
VM_ADD is a normal addition.
For the other arithmetic instructions the VM uses VM_ADD + NotNotAnd constructions,
and the VM is forced to calculate EFlags too (but for decompiling they are useless junk!)
VM_ADC (A, B) = VM_ADD(A, (B+Carry_flag))
VM_SUB (A, B) = VM_NOT [VM_ADD {B, (VM_NOT A)}]
EFlags = [And(0815, VM_ADD-EFlags)] + [And( {Not(0815) }, final-VM_NOT-EFlags)]
(virtualized into 36 VM-bytes)
VM_SBB (A, B) = VM_SUB(A, (B+Carry_flag))
VM_CMP = VM_SUB ; the result value is stored in an intermediate VM-register and discarded
VM_NEG (A) = like VM_SUB (0, A) , but the constant 0 is already Not-ed
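To see why the 0815 mask works (it selects CF, PF, AF and OF), here is a small flag model of VM_SUB compared against a plain SUB. It is a sketch under my own assumptions: the helper names are invented, only 32-bit operands are handled, and AF after the AND inside NotNotAnd is modelled as 0 (it is masked out anyway).

```python
MASK32 = 0xFFFFFFFF
CF, PF, AF, ZF, SF, OF = 0x001, 0x004, 0x010, 0x040, 0x080, 0x800

def result_flags(value):
    # ZF, SF and PF depend only on the result (PF on its low byte)
    flags = PF if bin(value & 0xFF).count("1") % 2 == 0 else 0
    if value == 0:
        flags |= ZF
    if value & 0x80000000:
        flags |= SF
    return flags

def add_flags(a, b):
    total = a + b
    res = total & MASK32
    flags = result_flags(res)
    if total > MASK32:
        flags |= CF
    if (a & 0xF) + (b & 0xF) > 0xF:
        flags |= AF
    if ~(a ^ b) & (a ^ res) & 0x80000000:
        flags |= OF
    return res, flags

def sub_flags(a, b):
    # reference model of a native SUB, used only for comparison
    res = (a - b) & MASK32
    flags = result_flags(res)
    if a < b:
        flags |= CF
    if (a & 0xF) < (b & 0xF):
        flags |= AF
    if (a ^ b) & (a ^ res) & 0x80000000:
        flags |= OF
    return res, flags

def not_not_and(a, b):
    res = ~a & ~b & MASK32
    return res, result_flags(res)      # AND leaves CF = OF = 0

def vm_not(a):
    return not_not_and(a, a)

def vm_sub(a, b):
    not_a, _ = vm_not(a)
    total, add_fl = add_flags(b, not_a)
    res, not_fl = vm_not(total)
    flags = (add_fl & 0x815) | (not_fl & ~0x815)   # the merge from the notes
    return res, flags

print(vm_sub(5, 3) == sub_flags(5, 3))
```

CF, PF, AF and OF come from the inner VM_ADD, while ZF and SF come from the final VM_NOT, which sees the real result.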
The CPU's Inc & Dec instructions do not affect the Carry flag, so the Carry flag must be left in its previous state.
VM_INC (A) = VM_ADD(1, A)
then the Carry flag is restored in EFlags
VM_DEC (A) = VM_ADD(-1, A)
then the Carry flag is restored in EFlags | plus Adjust-flag (AF) managing
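A small sketch of the flag fix-ups, modelling only CF and AF. Inverting AF for VM_DEC is my guess at what the flag managing has to achieve (an ADD of -1 produces the opposite AF of a real DEC); how the VM actually does it is not shown in these notes.

```python
MASK32 = 0xFFFFFFFF
CF, AF = 0x001, 0x010

def add_cf_af(a, b):
    # 32-bit ADD, returning the result and only the CF/AF bits
    total = a + b
    flags = 0
    if total > MASK32:
        flags |= CF
    if (a & 0xF) + (b & 0xF) > 0xF:
        flags |= AF
    return total & MASK32, flags

def vm_inc(a, old_flags):
    res, flags = add_cf_af(1, a)
    flags = (flags & ~CF) | (old_flags & CF)   # restore previous Carry flag
    return res, flags

def vm_dec(a, old_flags):
    res, flags = add_cf_af(MASK32, a)          # -1 as a 32-bit value
    flags = (flags & ~CF) | (old_flags & CF)   # restore previous Carry flag
    flags ^= AF                                # assumption: AF must be inverted
    return res, flags

print(vm_inc(0xFFFFFFFF, 0), vm_dec(0, CF))
```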
......
VM_Conditional_Jump works like this:
the VM places offset_NoJumpVM_IP & offset_JumpVM_IP on the stack,
then a pointer to the stack slot of offset_NoJumpVM_IP.
Then, according to EFlags, this pointer is either adjusted to point at JumpVM_IP or left as is.
Example for a JZ conditional jump. On the stack are pushed:
:: pointer_to_offset_NoJump (PushEBP_SS4)
:: offset_NoJump
:: offset_Jump
After the comparison the EFlags are set.
Shr (And 040 EFlags) 4
Add [pointer_to_offset_NoJump] result
if Zf=1, then result = 4 , else 0
..then VM_IP is loaded through this pointer
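The JZ selection is branch-free pointer arithmetic. A minimal model, with the two stack slots represented as a Python list and a byte offset (the function name and layout are mine):

```python
ZF = 0x040

def select_vm_ip(eflags, offset_no_jump, offset_jump):
    # two dwords back to back: no-jump target at +0, jump target at +4
    slots = [offset_no_jump, offset_jump]
    pointer = 0                      # points at offset_NoJump
    pointer += (eflags & ZF) >> 4    # 0x40 >> 4 == 4, one dword further
    return slots[pointer // 4]       # "load VM_IP from this pointer"

print(select_vm_ip(0x246, 0x1000, 0x2000))   # ZF is set in 0x246
```

For other conditions only the mask and shift change, so that the tested flag bit lands on the value 4.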
......
VMProtect also virtualizes the CPU's complex instructions,
if they can be represented by simple instructions.
VM_SETLE (virtualized into 80 VM-bytes!)
here EFlags are tested: AND (080, EFlags) | AND (0800, EFlags) ... AND (040, EFlags) ...
and the produced result byte is copied into the destination
VM_CMOVLE
the same kind of EFlags testing as in SETLE, + VM_Conditional_Jump
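The three masks are SF (080), OF (0800) and ZF (040); the LE condition is ZF=1 or SF<>OF. A compact model of the result byte (the function name is mine):

```python
def setle(eflags):
    zf = (eflags & 0x040) >> 6
    sf = (eflags & 0x080) >> 7
    of = (eflags & 0x800) >> 11
    return zf | (sf ^ of)     # 1 when "less or equal" holds, else 0

print(setle(0x080), setle(0x880), setle(0x040))
```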
VM_MOVSB, for example:
this complex instruction is first re-expressed as a group of simple assembly instructions,
then this group is virtualized.
VM_BSWAP is done in the following way (27 VM-bytes)
HiWord(result) = HiWord {Shl (LoWord_LoWord) 8}
LoWord(result) = HiWord {Shl (HiWord_HiWord) 8}
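Here LoWord_LoWord means a dword built from the low word repeated twice (likewise HiWord_HiWord). A quick check of the two formulas, with my own function name:

```python
MASK32 = 0xFFFFFFFF

def vm_bswap(x):
    lo, hi = x & 0xFFFF, x >> 16
    lo_lo = (lo << 16) | lo                    # LoWord_LoWord
    hi_hi = (hi << 16) | hi                    # HiWord_HiWord
    res_hi = ((lo_lo << 8) & MASK32) >> 16     # HiWord {Shl (LoWord_LoWord) 8}
    res_lo = ((hi_hi << 8) & MASK32) >> 16     # HiWord {Shl (HiWord_HiWord) 8}
    return (res_hi << 16) | res_lo

print(hex(vm_bswap(0x11223344)))
```

Shifting the doubled word left by 8 leaves its two bytes swapped in the high word, so each half of the result is a byte-swapped copy of the opposite half of the input.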
VM_XADD
the VM does the same as the CPU
VM_XCHG !?
while the VMProtect author takes care of the LOCK prefix & does not virtualize instructions that carry it,
the author made a mistake and virtualized the XCHG instruction.. oops!
to prevent XCHG virtualization, the author recommends adding a LOCK prefix
......
For FLD and FSTP instructions the memory content is copied onto the stack, & the load/store is done from there.
......
The VM-Registers space is 16 dwords. 8 of them are for Eax,Ecx,Edx,Ebx,Ebp,Esi,Edi,EFlags
Esp is directly assigned to the VM_stack(Ebp)
The other 8 are used for temporary storage: mostly for intermediate EFlags,
also for intermediate or temporary results (VM_TEST, VM_CMP), and for cleaning up the VM-stack;
look at VM_SUB, where 2 intermediate EFlags are added together.
The place of the real registers in this space is different not only for every other VM,
but can also change inside one VM!
A register is read from one place, afterwards it CAN be stored in an intermediate place, and
the old place becomes intermediate. So VM-Registers tracking is needed!
......
VM-entry works like this:
we are at the current stack position; let's call it TOP-ESP
at the place of the original opcodes there is a call into the VM:
push offset-VM_IP
call VM-StartCode
,,,,
VM-StartCode:
push Registers, EFlags ; << the order of pushes CAN differ from the order of pops in ExitVM!
push dword[passed_pointer_for_security]
; << new since 1.8, passed from StartupVM, which does the file CRC-check
push 0 ; Relocation-Difference
mov esi, [esp+030] ; offset-VM_IP
mov ebp, esp ; ebp becomes the VM-stack
sub esp, 0C0 ; 040 bytes are reserved for the 16 VM-Registers, the other 080 bytes are free space
; for user-pushed variables. If the VM-stack grows too low, the VM-Registers are
; moved down
mov edi, esp ; edi holds the VM-Registers pointer
add esi, [ebp+0] ; add Relocation-Difference to offset-VM_IP
and now the code is at VM_main_loop:
{VM_main_loop has 2 variations: reading the VM-bytes downward (as below) or the inverse, reading upward}
mov al,[esi]
movzx eax,al
inc esi
jmp [JumpTable + eax*4]
But VM-entry is not finished yet! Now the Prolog-VM-bytes are executed,
which move all the pushed Registers, EFlags + others into the VM-Registers space, until
VM-stack(ebp) has been popped back to TOP-ESP.
Only now does execution of the virtualized user code start.
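The fetch-and-dispatch loop shown above can be imitated with a toy interpreter. Everything concrete here is hypothetical: the opcode numbers, the tiny handler set and the register indexes are invented for the demo, and flag computation is left out. Only the loop shape (fetch one VM-byte, advance, jump through a table) follows the notes.

```python
def push_imm_dword(state, code):
    value = int.from_bytes(code[state["ip"]:state["ip"] + 4], "little")
    state["ip"] += 4
    state["stack"].append(value)

def add_dword_dword(state, code):
    b = state["stack"].pop()
    a = state["stack"].pop()
    state["stack"].append((a + b) & 0xFFFFFFFF)   # result
    state["stack"].append(0)                      # EFlags slot (not computed here)

def pop_dword_to_vm_reg(state, code):
    index = code[state["ip"]]                     # register id follows the opcode
    state["ip"] += 1
    state["regs"][index] = state["stack"].pop()

def exit_vm(state, code):
    state["running"] = False

# hypothetical opcode assignment; a real VM shuffles this per build
HANDLERS = {0x00: exit_vm, 0x01: push_imm_dword,
            0x02: add_dword_dword, 0x03: pop_dword_to_vm_reg}

def run_vm(bytecode, handlers):
    state = {"ip": 0, "stack": [], "regs": [0] * 16, "running": True}
    while state["running"]:
        opcode = bytecode[state["ip"]]     # mov al,[esi] / movzx eax,al
        state["ip"] += 1                   # inc esi
        handlers[opcode](state, bytecode)  # jmp [JumpTable + eax*4]
    return state

# push 2, push 3, add, pop EFlags -> reg 8, pop result -> reg 0, exit
DEMO = bytes([0x01, 2, 0, 0, 0, 0x01, 3, 0, 0, 0, 0x02, 0x03, 8, 0x03, 0, 0x00])
print(run_vm(DEMO, HANDLERS)["regs"][0])
```

Note how the add handler is followed by a pop of the EFlags slot into a VM-register, the same opcode pairing described earlier for the logical and arithmetic handlers.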
......
VM-exit works like this:
VM-stack(ebp) is at TOP-ESP (it can be above the start value, if a Ret_nn was emulated or Esp was changed)
Now the VM executes the Epilog-VM-bytes, which move all required values (return_IP if needed)
from the VM-Registers space to the stack, and Ebp is ready for the ExitVM handler.
At last comes the VM-byte that calls the ExitVM handler, which pops everything from the stack:
mov esp, ebp
pop Registers, EFlags ; << Order of pop CAN be other
ret
......
If the VM can't virtualize an instruction, there are 2 ways:
1. if the instruction does not affect Memory-Registers-EFlags (like EMMS, WAIT),
it can be included in any free VM-handler and an appropriate VM-byte is assigned to it
2. else the VM does a full VM-exit to this instruction & after its execution calls the VM again.
......
The effective address of a memory operand is calculated step by step.
example: mov eax D$edi+ecx*8+088888888
all values are pushed & added; for (ecx*8) the ShlDword_SS2 handler is used.
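The example address can be traced on a small stack model. The exact order in which the VM pushes the parts is my assumption; only the idea (push the pieces, shift for the scale, add everything) comes from the notes.

```python
MASK32 = 0xFFFFFFFF

def effective_address(edi, ecx):
    stack = []
    # ecx*8 is done as a shift left by 3 (the ShlDword step)
    stack.append(ecx)
    stack.append(3)
    count, value = stack.pop(), stack.pop()
    stack.append((value << count) & MASK32)
    # add the base register
    stack.append(edi)
    stack.append((stack.pop() + stack.pop()) & MASK32)
    # add the displacement
    stack.append(0x88888888)
    stack.append((stack.pop() + stack.pop()) & MASK32)
    return stack.pop()

print(hex(effective_address(0x1000, 2)))
```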
......
In this manner many instructions are virtualized, in the hope of hiding them.
OK, but now the generic question is: does complex-opcode virtualization matter!?!?
For example, let's say we decompiled VM_MOVSB into a group of simple instructions
& we did not guess that they mean the initial opcode. But is that a problem!? No!
The code has all its functionality anyway!