Kayaker
December 19th, 2007, 04:10
I had started live tracing this piece of malware when I realized it was a prime candidate for some IDA idc scripting. There are a few things about the program that make static analysis difficult.
It's encrypted. The encryption is rather simple, however, and is easily scripted out.
The code is primarily executed in the .data section and there are many inline character strings and non-standard code instructions which prevent IDA from getting an accurate disassembly.
Variable pointers and import calls are referenced as EBP offsets, so IDA can't recognize absolute addresses to create the proper Xrefs, autogeneration and all the other wonderful analysis it normally performs.
Imports are determined dynamically through GetProcAddress, so until we define them IDA can't recognize them.
Let's address each of these problems through a bit of idc scripting and static analysis to supplement our live tracing. In Part 2 I'll mention a few points about the viral code itself.
I've included several files in the attachment: the idc scripts and a header file prototyping some functions and structures not included in the internal IDA definitions. There is also an IDB file in IDA 4.9 freeware version format which is fully commented with all functions and variables defined (or at least named; this is meant to be a "working" disassembly for further analysis, not necessarily a definitive treatise). The IDB file can be opened in any IDA version 4.9 and above.
Also included is of course the virus, or else what fun would this be? The win32_virut.exe file has been renamed with a .VXE extension and zipped with the password "malware". It is quite an infectious file, but it readily detects a normal virtual machine sandbox and won't infect under those conditions. You actually have to force the code to decrypt its payload under a VM. Also, the remote site it tries to connect to has been closed for Terms Of Agreement violations (ya think?), so no live connection is ever made.
You may find it easier to read the idc scripts by downloading the originals or by reading this post in the Blogs Forum (follow the link under 'Post or View Comments' at the bottom of this blog).
A brief description of the virus family from
http://www.bitdefender.com/VIRUS-1000163-en--Win32.Virtob
Quote:
This virus is a polymorphic, memory-resident file-infector, with backdoor behaviour. Once executed, it injects itself into WINLOGON, creates a new thread in that process, and passes the execution control to the host file.
It also hooks the following functions in each running process (in NTDLL module): NtCreateFile, NtOpenFile, NtCreateProcess, NtCreateProcessEx so that every time an infected process calls one of these functions, the execution is passed to the virus, which infects the accessed file, and then returns the control to the original function.
It infects EXE and SCR files, using different infection techniques: Appending to the last section of the victim, and setting the Entry Point directly to the viral code. (our variant)
The virus is able to avoid emulators and virtual machines.
To ensure there's only one instance of it running in the system, it creates an event with one of the following names: VT_3, VT_4, VevT, Vx_4
It tries to connect to some IRC server, and join a certain channel. Once it joins the channel, it waits for commands that instruct it to download several files from Internet, and then execute them.
The IRC server can be: proxim.ntkrnlpa.info (our variant, site no longer active)
Much of what is written above can be figured out by live tracing the malware and eventually letting it infect our sandbox. But let's see what damage we can do to it before it does damage to us.
Step 1: Decrypt
Here is the Entry Point of the virus, which is in the .data section:
Code:
:00403200 cld
:00403201 call loc_40322E
:00403206 push ebx
Take note that the return address of the Call pushed onto the stack will be 0x403206. Trace into the Call and after a bit of preliminary code we reach here:
Code:
:00403257 mov ebp, [esp+4]
// call return of 0x403206 placed in ebp
:0040325B sub dword ptr [esp+4], 21E9h
// new return address of (0x403206 - 0x21E9) = 0x40101D placed on stack
...
:0040326A sub ebp, 301006h
// ebp offset becomes (0x403206 - 0x301006) = 0x102200
:00403270 lea eax, [ebp+301082h]
// eax = (0x102200 + 0x301082) = 0x403282
// this is the starting address of the encrypted code
:00403276 mov dx, [eax-65h]
// word pointer at 0x40321D is a decryption seed value: db 8Eh, 0C8h
:0040327D call sub_403206 // Decryption routine
// the code from here on down is all encrypted
:00403282 db 65h
:00403282 enter 0BDDh, 0C1h
:00403287 push ss
The fact that none of the code from address 0x403282 onwards makes much sense indicates that Call sub_403206 is a decryption routine. Let's take a look at that call:
Code:
:00403206 Decrypt proc near ; CODE XREF: :0040327D
:00403206 push ebx
:00403207 mov ecx, 0DA5h
:0040320C mov ebx, edx
:0040320E
:0040320E loc_40320E: ; CODE XREF: Decrypt+13
:0040320E xor [eax], dx
:00403211 lea eax, [eax+2]
:00403214 xchg dl, dh
:00403216 lea edx, [ebx+edx]
:00403219 loop loc_40320E
:0040321B pop ebx
:0040321C retn
:0040321C Decrypt endp
:0040321C
:0040321C ; ---------------------------------------------------------------
:0040321D Initial_Decrypt_Seed db 8Eh, 0C8h
:0040321F; ----------------------------------------------------------------
This is a simple XOR decryption loop in which the xor value is modified on each iteration by the XCHG and LEA instructions. ECX is a counter decremented by the LOOP opcode. The initial decryption seed value is the db 8Eh, 0C8h we discovered above.
Armed with this small bit of analysis we can create the following idc script for decrypting.
Code:
#include <idc.idc>
// Step 1: idc to decrypt section between .data:0x403282 and .data:0x404DCC
// performs the equivalent asm function (xchg dl, dh)
#define bswap16(x) \
    ((((x) & 0xff00) >> 8) | \
     (((x) & 0x00ff) << 8))
static main()
{
    auto startdecrypt, size, enddecrypt, seed, ea, decryptword, x;
    // starting values determined from decrypt function
    startdecrypt = 0x403282;
    size = 0x0DA5 * 2; // word size replacement
    enddecrypt = (startdecrypt + size); // = 0x404DCC
    seed = 0xC88E;
    ea = startdecrypt;
    decryptword = seed;
    Message("\nDecrypting... \n");
    while (ea < enddecrypt)
    {
        // (xor [eax], dx)
        x = Word(ea); // fetch the word
        x = (x ^ decryptword); // decrypt it
        PatchWord(ea, x); // put it back
        decryptword = bswap16(decryptword); // xchg dl, dh
        // lea edx, [ebx+edx]; only the low word (dx) is used by the xor
        decryptword = (decryptword + seed) & 0xFFFF;
        ea = ea + 2;
    }
    // Let's try to get IDA to reanalyze the code
    MakeUnknown (startdecrypt, size, 1);
    AnalyzeArea (startdecrypt, enddecrypt);
    Message("...Done \n");
}
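As a side check outside IDA, the same loop can be modeled in plain Python. This is only a sketch under assumptions: the seed 0xC88E and the word count 0x0DA5 come from the listing above, while the function names (`swap_bytes`, `keystream`, `xor_words`) are made up for illustration and are not part of the attached scripts.

```python
import struct

SEED = 0xC88E          # db 8Eh, 0C8h read as a little-endian word
WORD_COUNT = 0x0DA5    # loop counter loaded into ecx


def swap_bytes(value: int) -> int:
    """Equivalent of 'xchg dl, dh' on a 16-bit value."""
    return ((value & 0xFF00) >> 8) | ((value & 0x00FF) << 8)


def keystream(seed: int, count: int):
    """Yield the 16-bit value of dx used on each loop iteration."""
    key = seed
    for _ in range(count):
        yield key
        # xchg dl, dh followed by lea edx, [ebx+edx]; only dx feeds the xor,
        # so the result is truncated to 16 bits here.
        key = (swap_bytes(key) + seed) & 0xFFFF


def xor_words(data: bytes, seed: int = SEED) -> bytes:
    """XOR each little-endian word of data with the evolving key."""
    if len(data) % 2:
        raise ValueError("data length must be a multiple of 2")
    words = struct.unpack("<%dH" % (len(data) // 2), data)
    out = [w ^ k for w, k in zip(words, keystream(seed, len(words)))]
    return struct.pack("<%dH" % len(out), *out)


if __name__ == "__main__":
    print([hex(k) for k in keystream(SEED, 3)])   # ['0xc88e', '0x5756', '0x1ee5']
    print(hex(0x403282 + WORD_COUNT * 2))         # 0x404dcc
```

Because the operation is a plain XOR against a key sequence that does not depend on the data, applying `xor_words` twice returns the original bytes, which makes it easy to test on a dumped copy of the section before patching anything in the IDB.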
After running the decryption script you MUST go through the decrypted section and manually resolve the embedded string pointers with the IDA A(scii) command and any unrecognized or incorrect disassembly with the C(ode) command. This is a necessary step for the subsequent idc scripts to work properly!
You will find a lot of things like the following, which you need to make sure are correctly resolved. By itself IDA won't properly disassemble the code.
Code:
:004032D8 E8 0D call loc_4032EA
:004032D8 ; --------------------------------------------
:004032DD 47 65+ aGetlasterror db 'GetLastError',0 // LPCSTR lpProcName
:004032EA ; --------------------------------------------
:004032EA
:004032EA loc_4032EA: ; CODE XREF: :004032D8
:004032EA 03 F3 add esi, ebx
:004032EC 53 push ebx // HMODULE hModule
:004032ED FF D6 call esi // GetProcAddress
Notice the neat little trick in the above code of how the second parameter of GetProcAddress is automatically pushed onto the stack by effectively being the return address of Call loc_4032EA, which jumps over the string. This type of thing is repeated throughout the program.
Chances are you won't get every bit of disassembly and ascii string identified correctly the first time through a manual fixup, but after applying the subsequent idc scripts those problem areas should be identified and you can go back and correct them before running the scripts again. You'll find odd things such as wsprintf format strings, non-null terminated string blocks with xrefs to parts of them, call instructions where the offset displacement is dynamically calculated, etc.
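To cut down on the manual hunting, candidates for the call-over-string pattern can be located by scanning the decrypted bytes for a `call rel32` whose target is the byte right after a NUL-terminated printable string. The following is a rough sketch, not one of the attached scripts: the function name is hypothetical, and the sample bytes are reconstructed from the listing above (the listing truncates the call encoding to "E8 0D", so the full `E8 0D 00 00 00` is inferred from the addresses).

```python
import struct


def find_call_over_string(code: bytes, base: int):
    """Return (call_address, string, target) for each call that skips an ASCIIZ string."""
    hits = []
    for offset in range(len(code) - 5):
        if code[offset] != 0xE8:                      # opcode of call rel32
            continue
        (rel32,) = struct.unpack_from("<i", code, offset + 1)
        start = offset + 5                            # first byte after the call
        end = start + rel32                           # call target as a buffer offset
        if rel32 < 2 or end > len(code):
            continue
        blob = code[start:end]
        # The skipped bytes must be printable ASCII followed by a single NUL.
        if blob[-1] != 0 or not all(0x20 <= b < 0x7F for b in blob[:-1]):
            continue
        hits.append((base + offset, blob[:-1].decode("ascii"), base + end))
    return hits


if __name__ == "__main__":
    # call loc_4032EA / 'GetLastError',0 / add esi, ebx / push ebx / call esi
    sample = (b"\xE8\x0D\x00\x00\x00" + b"GetLastError\x00"
              + b"\x03\xF3\x53\xFF\xD6")
    for call_ea, text, target in find_call_over_string(sample, 0x4032D8):
        print(hex(call_ea), repr(text), hex(target))
    # 0x4032d8 'GetLastError' 0x4032ea
```

A scan like this will also produce false positives on data that happens to contain 0xE8, so the results are only a list of addresses worth checking by hand with the A and C commands.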
Step 2: Resolve EBP offsets to real addresses
You'll notice after decrypting the file that variable pointers and import calls are in the form of [ebp+30xxxx]. We've already determined above that EBP = 0x102200, so we simply need to calculate the real address used and replace the operand.
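The arithmetic is simple enough to check by hand, but a few lines of Python make it explicit. This is just a sketch of the calculation using the constants from the Step 1 listing; the names `EBP_DELTA` and `resolve_ebp_operand` are mine and do not appear in the idc scripts.

```python
# Values taken from the disassembly listing; the helper names are hypothetical.
CALL_RETURN_ADDRESS = 0x403206   # return address pushed by "call loc_40322E"
SUBTRACTED_CONSTANT = 0x301006   # constant subtracted from ebp at 0x40326A

EBP_DELTA = CALL_RETURN_ADDRESS - SUBTRACTED_CONSTANT


def resolve_ebp_operand(displacement: int) -> int:
    """Turn the displacement of an [ebp+disp] operand into a real address."""
    return (EBP_DELTA + displacement) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(EBP_DELTA))                       # 0x102200
    print(hex(resolve_ebp_operand(0x301082)))   # 0x403282, start of the encrypted code
    print(hex(resolve_ebp_operand(0x302BD5)))   # 0x404dd5
```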
Rather than just replacing the operand text itself with the calculated real address, say by using the idc command
success OpAlt (long ea,long n,string opnd); // manually enter n-th operand
we will actually patch in the proper displacement in the hex bytes with
PatchDword (long ea,long value);
After handling each affected instruction we need to undefine it with
MakeUnkn (long ea, long expand);
and have IDA reanalyze with
AnalyzeArea (long sEA,long eEA);
The operands should be converted to a real offset and the proper xrefs resolved for each instruction.
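What the patch does to the raw bytes can be illustrated outside IDA. The sketch below assumes the instruction `mov [ebp+302BD5h], eax` is encoded as `89 85 D5 2B 30 00` (opcode 89 /r, ModRM 85h, then a 32-bit displacement); that encoding is my own assumption and is not shown in the listings. The function name is hypothetical.

```python
import struct

EBP_DELTA = 0x102200


def patch_displacement(insn: bytes, displacement: int, delta: int = EBP_DELTA) -> bytes:
    """Replace the 32-bit displacement inside one instruction with displacement + delta."""
    needle = struct.pack("<I", displacement)
    pos = insn.find(needle)                 # first match only
    if pos == -1:
        raise ValueError("displacement not found in instruction bytes")
    patched = struct.pack("<I", (displacement + delta) & 0xFFFFFFFF)
    return insn[:pos] + patched + insn[pos + 4:]


if __name__ == "__main__":
    insn = bytes.fromhex("8985D52B3000")    # assumed bytes of mov [ebp+302BD5h], eax
    print(patch_displacement(insn, 0x302BD5).hex())   # 8985d54d4000
```

Note that this sketch patches the first matching dword it finds, and it refuses to patch anything when the pattern is absent, which is the case worth guarding against in any scripted version.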
We also use the idc commands
long GetOperandValue (long ea,long n); // get instruction operand value
string GetOpnd (long ea,long n); // get instruction operand
e.g. for the instruction
mov [ebp+302BD5h], eax
GetOpnd (ea,0); returns the string "[ebp+302BD5h]"
GetOperandValue (ea,0); returns 0x00302BD5
After patching in the real address and reanalyzing the code the operand will be rewritten with an "ss:" prefix and/or "[ebp]" suffix.
i.e. the previous example will resolve to:
ss:dword_404DD5[ebp]
We don't really want that so we can remove those string components by parsing them out. That will be the job of the next idc script. That step could be added here, but for demonstration purposes I keep them separate.
AnalyzeArea might not resolve all the instructions properly the first time through, so a second pass is necessary to convert any instructions that are still in the form of [ebp+xxxxxxxx]. This usually occurred where we (I) didn't make the proper manual corrections to the disassembly or inline ascii strings after running the decryption idc script. We can use the GetFlags(long ea) command to get the internal flags for the address the operand points to and deal with each type individually.
Any problem operands remaining will be pointed out by the idc script, and will also be highlighted in red by IDA. These should be handled manually. For example, the virus may create a dynamically determined call offset or otherwise change an instruction. IDA resolves these as Xrefs into the middle of an instruction, but doesn't quite get the syntax right when running AnalyzeArea through the idc script. However, if you right click on the errant operand you will probably find a more accurate selection.
Enough of the preamble, I just wanted to touch on a few points of using these idc commands.
Here's the second script:
Code:
#include <idc.idc>
// Step 2: idc to resolve EBP offsets to real addresses
static resolve_offsets(ea, n)
{
    auto OpVal, realaddress, patchoffset, i;
    OpVal = GetOperandValue(ea, n);
    if (OpVal > 0x400000)
    {
        return; // we've already converted this operand
    }
    // calculate the real address (EBP = 0x102200)
    realaddress = OpVal + 0x102200;
    // calculate the offset where the operand begins in the instruction
    patchoffset = BADADDR;
    for (i = 0; i < ItemSize(ea) - 3; i++)
    {
        if (Dword(ea + i) == OpVal)
        {
            // Pattern found
            patchoffset = (ea + i);
        }
    }
    if (patchoffset == BADADDR)
    {
        return; // displacement not found, leave it for the manual check
    }
    // patch in the real displacement
    PatchDword(patchoffset, realaddress);
    // undefine the instruction so it will be reanalyzed fresh later
    MakeUnkn (ea, 0);
}
static main()
{
    auto startea, endea, ea, n, nextea, OpVal, uFlags, count1, count2, count3;
    startea = 0x403270; // first occurrence of [ebp+30xxxx] offset
    endea = 0x404DCC; // determined from idc in Step 1
    // Use some counters to check that all operands were handled properly.
    // Remaining errors likely mean we didn't make the correct analysis
    // after running the decrypt script in Step 1.
    // Go back, correct those instructions and rerun this script.
    count1 = 0;
    count2 = 0;
    count3 = 0;
    /////////////////////////////////////////////////////////////////////////
    // Pass 1:
    // Convert operands of the form "[ebp+30xxxxh]" to a real offset
    /////////////////////////////////////////////////////////////////////////
    ea = startea;
    Message("\nConverting EBP offset operands to real addresses \n");
    while (ea != BADADDR)
    {
        // calculate next instruction pointer before we modify anything
        nextea = NextHead(ea, endea);
        // check both the first(0) and second(1) operand of the instruction
        for (n=0; n<2; n++)
        {
            // for all instructions with an offset in the form of "[ebp+"
            if( strstr( GetOpnd (ea, n), "[ebp+" ) != -1 )
            {
                count1 = count1 + 1;
                resolve_offsets(ea, n);
            }
        }
        ea = nextea; // next instruction
    }
    // Reanalyze
    AnalyzeArea (startea, endea);
    /////////////////////////////////////////////////////////////////////////
    // Pass 2:
    // Make a second pass at autoanalysing operands
    // still in the form of "[ebp+"
    /////////////////////////////////////////////////////////////////////////
    ea = startea;
    Message("Running a second pass at autoanalysis \n");
    while (ea != BADADDR)
    {
        nextea = NextHead(ea, endea);
        for (n=0; n<2; n++)
        {
            // for all instructions with an offset in the form of "[ebp+"
            if( strstr( GetOpnd (ea, n), "[ebp+" ) != -1 )
            {
                count2 = count2 + 1;
                // Get operand value
                OpVal = GetOperandValue(ea, n);
                // Get value of internal flags to see how IDA
                // has defined the operand to this point
                uFlags = GetFlags(OpVal);
                if(isData(uFlags))
                {
                    // If operand offset is already defined as 'data'
                    // then we only need to reanalyze the instruction
                    // to get IDA to resolve the xref
                    // undefine the instruction so it will be reanalyzed fresh
                    MakeUnkn (ea, 0);
                } else
                if(isUnknown(uFlags))
                {
                    // If operand offset is defined as 'unknown', create
                    // a data xref at the operand address and reanalyze
                    add_dref(ea, OpVal, XREF_USER | dr_O);
                    MakeUnkn (ea, 0);
                } else {
                    // GetFlags(OpVal) indicates that what is left over is
                    // defined as 'isTail'. Undefine both the operand address
                    // and the calling instruction and let IDA reanalyze
                    MakeUnkn (OpVal, 0);
                    MakeUnkn (ea, 0);
                }
            }
        }
        ea = nextea;
    }
    // Reanalyze
    AnalyzeArea (startea, endea);
    /////////////////////////////////////////////////////////////////////////
    // Pass 3:
    // Finally, let's inform ourselves of which instructions are still
    // in the form of "[ebp+" and should be checked manually.
    // The offsets will be highlighted in red by IDA as well.
    /////////////////////////////////////////////////////////////////////////
    ea = startea;
    Message("The following instructions (if any) are still in error and ");
    Message("should be fixed manually before rerunning this script \n");
    while (ea != BADADDR)
    {
        nextea = NextHead(ea, endea);
        for (n=0; n<2; n++)
        {
            // for all instructions with offset *still*
            // in the form of "[ebp+"
            if( strstr( GetOpnd (ea, n), "[ebp+" ) != -1 )
            {
                count3 = count3 + 1;
                Message("%d 0x%08X %s \n", count3, ea, GetOpnd (ea, n));
            }
        }
        ea = nextea;
    }
    Message("\n%d / %d operands analysed correctly on first pass \n",
        count1-count2, count1);
    Message("%d / %d operands corrected on second pass \n",
        count2-count3, count1);
    /////////////////////////////////////////////////////////////////////////
    Message("...Done \n");
}
Step 3: Parse out unwanted operand text
Code:
#include <idc.idc>
// Step 3: idc to parse out unwanted text
// from an operand such as "ss:dword_404DD5[ebp]"
static clean_text(ea, n)
{
    auto OldOpStr, TempOpStr, NewOpStr, pos, beforestr, afterstr;
    beforestr = 0;
    afterstr = 0;
    OldOpStr = GetOpnd (ea, n);
    // find position of "ss:" if present and remove it
    pos = strstr(OldOpStr, "ss:");
    if(pos != -1) // contains substring
    {
        beforestr = substr(OldOpStr, 0, pos);
        afterstr = substr(OldOpStr, pos+3, -1);
        // combine string parts without "ss:"
        TempOpStr = beforestr + afterstr;
    } else {
        TempOpStr = OldOpStr;
    }
    // find position of "[ebp]" if present and remove it
    pos = strstr(TempOpStr, "[ebp]");
    if(pos != -1)
    {
        beforestr = substr(TempOpStr, 0, pos);
        afterstr = substr(TempOpStr, pos+5, -1);
        // combine string parts without "[ebp]"
        NewOpStr = beforestr + afterstr;
    } else {
        NewOpStr = TempOpStr;
    }
    // replace the operand if either "ss:" or "[ebp]" was removed
    if(NewOpStr != OldOpStr)
    {
        OpAlt(ea, n, NewOpStr);
    }
}
static main()
{
    auto startea, endea, ea, n;
    startea = 0x403270; // first occurrence of [ebp+30xxxx] offset
    endea = 0x404DCC; // determined from idc in Step 1
    ea = startea;
    Message("\nCleaning up operand syntax... \n");
    while (ea != BADADDR)
    {
        // check both the first(0) and second(1) operand of the instruction
        for (n=0; n<2; n++)
        {
            // for all instructions where we find "ss:" or "[ebp]"
            if( strstr( GetOpnd (ea, n), "ss:" ) != -1 ||
                strstr( GetOpnd (ea, n), "[ebp]" ) != -1 )
            {
                clean_text(ea, n);
            }
        }
        ea = NextHead(ea, endea); // next instruction
    }
    Message("...Done \n");
}
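The string handling in Step 3 is easy to try out on its own before running it over the database. Here is a minimal Python equivalent, assuming the only unwanted pieces are the literal substrings "ss:" and "[ebp]"; the function name `clean_operand` is hypothetical.

```python
def clean_operand(text: str) -> str:
    """Strip the 'ss:' prefix and the '[ebp]' suffix from an operand string."""
    for unwanted in ("ss:", "[ebp]"):
        text = text.replace(unwanted, "", 1)   # remove the first occurrence only
    return text


if __name__ == "__main__":
    print(clean_operand("ss:dword_404DD5[ebp]"))   # dword_404DD5
    print(clean_operand("[ebp+302BD5h]"))          # [ebp+302BD5h] (left alone)
```

Operands still in the unresolved "[ebp+...]" form are deliberately left untouched, since they do not contain the literal "[ebp]" and need the Step 2 treatment first.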
Step 4: Resolve API names
Immediately after the code is decrypted by the program it retrieves the offset of GetProcAddress by finding the base of kernel32.dll and parsing through its export table. All other import addresses, including those it hooks from ntdll.dll, are obtained by using GetProcAddress.
The import names it wants are in an ascii table and a simple routine is used for each dll it searches for import addresses.
For example,
Code:
:00403369 lea ESI, aLstrcat ; "lstrcat"
:0040336F xor ecx, ecx
:00403371 lea EDI, dword_404DE9
:00403377 mov CL, 24h
:00403379 call GetProcAddress_Routine
ESI is the start of the API name table, which begins here:
Code:
:004037B3 aLstrcat db 'lstrcat',0 ; DATA XREF: :00403369t
:004037BB aLstrlen db 'lstrlen',0
:004037C3 aCreatefilea db 'CreateFileA',0
:004037CF aCreatefilemapp db 'CreateFileMappingA',0
...
EDI is the start of the table where it places the API addresses.
ECX (CL) contains the number of import names to find for this particular dll.
The Call is a simple LOOP which calls GetProcAddress for each import and stores their offsets.
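The relationship between the two tables can be modeled as follows. This is a sketch under assumptions: names are consecutive NUL-terminated strings, each resolved address occupies a 4-byte slot, and the function name `pair_names_with_slots` is invented for the example. Only the first four names are used because the listing elides the rest.

```python
def pair_names_with_slots(name_table: bytes, slot_base: int, count: int, slot_size: int = 4):
    """Map each address-table slot to the import name that will be stored there."""
    names = name_table.split(b"\x00")[:count]
    if len(names) < count or b"" in names:
        raise ValueError("name table holds fewer than 'count' strings")
    return {slot_base + i * slot_size: name.decode("ascii")
            for i, name in enumerate(names)}


if __name__ == "__main__":
    # First four entries of the name table at 0x4037B3, as shown in the listing.
    table = b"lstrcat\x00lstrlen\x00CreateFileA\x00CreateFileMappingA\x00"
    for slot, name in pair_names_with_slots(table, 0x404DE9, 4).items():
        print(hex(slot), name)
    # 0x404de9 lstrcat
    # 0x404ded lstrlen
    # 0x404df1 CreateFileA
    # 0x404df5 CreateFileMappingA
```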
Having cleaned up the disassembly with the first 3 idc scripts we can easily find where this GetProcAddress routine is cross referenced in the file and get the necessary values for each of the 5 dlls in order to resolve API names with the following script:
Code:
#include <idc.idc>
// Step 4: idc to resolve import calls and enter their name
static patchapi(apinametable, apiaddresstable, numapis)
{
    auto apiname;
    while (numapis != 0)
    {
        apiname = GetString(apinametable, -1, ASCSTR_C);
        if (!MakeNameEx(apiaddresstable, apiname, SN_AUTO))
        {
            // we will get an error because LoadLibraryA is already defined
            // rename as LoadLibraryA_0
            Message("API name already in use, renaming as %s \n", apiname + "_0");
            MakeNameEx(apiaddresstable, apiname + "_0", SN_AUTO);
        }
        apinametable = NextHead(apinametable, BADADDR);
        apiaddresstable = apiaddresstable+4;
        numapis = numapis - 1;
    }
}
static main()
{
    Message("\nResolving API names... \n");
    patchapi(0x4037B3, 0x404DE9, 0x24);
    patchapi(0x4039BE, 0x404E79, 0x0D);
    patchapi(0x403B5F, 0x404EDD, 0x04);
    patchapi(0x403AB6, 0x404EAD, 0x07);
    patchapi(0x403AF4, 0x404EC9, 0x05);
    Message("Game over \n");
}
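A quick sanity check on the five sets of values passed to patchapi is that the address tables should not overlap, and in this case they appear to tile one contiguous block. The Python below is only a sketch of that check; the triples are copied from the script above, the 4-byte slot size is taken from the `apiaddresstable+4` step, and the helper names are hypothetical.

```python
# (name table, address table, count) triples copied from the patchapi() calls.
TABLES = [
    (0x4037B3, 0x404DE9, 0x24),
    (0x4039BE, 0x404E79, 0x0D),
    (0x403B5F, 0x404EDD, 0x04),
    (0x403AB6, 0x404EAD, 0x07),
    (0x403AF4, 0x404EC9, 0x05),
]
SLOT_SIZE = 4


def address_ranges(tables):
    """Return sorted (start, end) ranges covered by each address table."""
    return sorted((addr, addr + count * SLOT_SIZE) for _, addr, count in tables)


def is_contiguous(ranges):
    """True when each range starts exactly where the previous one ends."""
    return all(prev_end == start
               for (_, prev_end), (start, _) in zip(ranges, ranges[1:]))


if __name__ == "__main__":
    ranges = address_ranges(TABLES)
    print(is_contiguous(ranges))                        # True
    print(hex(ranges[0][0]), hex(ranges[-1][1]))        # 0x404de9 0x404eed
    print(sum(count for _, _, count in TABLES))         # 65
```

If a count or table address had been misread from the disassembly, a gap or overlap would show up here before any names were written into the database.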
Step 5: Apply C header file
The last step is to read in the header file, defines.h, with the IDA menu command File/Load file/Parse C header file (Ctrl+F9). This file contains some of the function prototypes and structures not defined by default by IDA, primarily the ntdll imports.
You'll probably notice that the parameter definitions for import calls are not always propagated correctly; some may have them, some may not. There are a few things that may help, which fall into the category of "dealing with IDA quirks":
Make sure the code containing the import(s) is within a defined function (Create function).
Make sure the function has a proper endpoint, e.g. some of the virus function blocks may end with a JMP (Set function end).
Select (Edit function). Don't make any changes, just close the dialog box. This seems to force IDA to reanalyze the function and often redefine and propagate any import parameters correctly.
Right click on the import function, undefine and then redefine as Code. Again, this seems to work for some cases.
Once all this "prettying up" of the disassembly is done you can finally get to the fun part of analyzing the program.
The included IDB file has most of the virus functionality in the .data section defined in a general way. The .text section has a small decryption function I didn't bother detailing; it is most easily dealt with under a debugger and is completely safe to let run under a closed sandbox environment.
Again, the idc scripts, IDB file and virus are in the attachment. The exe has been renamed .vxe and zipped with the password "malware".
Part 2 ("http://www.woodmann.com/forum/showthread.php?t=11075") of this post will follow.
Win32_Virut_Analysis.zip ("http://www.woodmann.com/forum/blog_attachment.php?attachmentid=3&d=1198051304") (183.2 KB)