Breaking the Windows Script Encoder

by Mr. Brownstone

The Windows Script Encoder (SCRENC.EXE) is a Microsoft tool that can be used to encode your scripts (i.e., JavaScript, ASP pages, VBScript).

Yes... encode, not encrypt.

The purpose of this tool is to prevent people from looking at, or modifying, your scripts.  Microsoft recommends using the Script Encoder to obfuscate your ASP pages, so that if your server is compromised, the hacker will be unable to find out how your ASP applications work.

You can download the Windows Script Encoder v1.0 at: www.microsoft.com/downloads/details.aspx?FamilyID=e7877f67-c447-4873-b1b0-21f0626a6329&displaylang=en

The documentation already says the following:

"Note that this encoding only prevents casual viewing of your code; it will not prevent the determined hacker from seeing what you've done and how."

(By the way, because of this text, I did not deem it necessary to inform Microsoft of this article).

Also, an encoded script is protected against tampering and modifications:

"After encoding, if you change even one character in the encoded text, the integrity of the entire script is lost and it can no longer be used."

So we can make the following observations:

  1. We are a "determined hacker."  *grins*
  2. If it's about "preventing casual viewing," what's wrong with encoding mechanisms like a simple XOR or even uuencoding, Base64, and URL-encoding?
  3. Anyone using this tool will be convinced that it's safe to hard-code all usernames, passwords, and "secret" algorithms into their ASP-pages.  And any "determined hacker" will be able to get to them anyway.

Okay.  So even Microsoft says this can be broken.  Can't be difficult then.  It wasn't.  Writing this article took me at least twice the time I needed for breaking it.  But I think this can be a very nice exercise for anyone who wants to learn more about analyzing codes like this, with known plaintext, known ciphertext, and unknown key and algorithm.

(Actually, a COM object that can do the encoding is shipped with IE 5.0, so reverse engineering this will reveal the algorithm, but that's no fun, is it?)

So, How Does This Work?

The Script Encoder works in a very simple way.  It takes two parameters: the filename of the file containing the script, and the name of the output file, containing the encoded script.

What part of the file will be encoded depends on the filename extension, as well as on the presence of a so-called "encoding marker."  This encoding marker allows you to exclude part of your script from being encoded.  This can be very handy for JavaScript, because the encoded scripts will only work on Microsoft IE 5.0 or higher.  (Of course, this is not an issue for ASP and VBScripts that run on a web server!)

Say, you've got this HTML page with a script you want to hide from prying eyes:

<HTML>
 <HEAD>
  <TITLE>Page with secret information</TITLE>
   <SCRIPT LANGUAGE="JScript">
    <!--//
    //**Start Encode**
    alert ("this code should be kept secret!!!!");
    //-->
   </SCRIPT>
  </HEAD>
 <BODY>
   This page contains secret information.
 </BODY>
</HTML>

This is what it looks like after running Windows Script Encoder:

<HTML>
 <HEAD>
  <TITLE>Page with secret information</TITLE>
   <SCRIPT LANGUAGE="JScript.Encode">
    <!--//
    //**Start Encode**
    #@~^QwAAAA==@#@&P~,l^+DDPvEY4kdP1W[n,/tK;V9P4~V+aY,/nm.nD"Z"eE#p@#@&&JOO@*@#@&qhAAAA==^#~@&
   </SCRIPT>
  </HEAD>
 <BODY>
   This page contains secret information.
 </BODY>
</HTML>

As you can see, the <SCRIPT LANGUAGE="..."> has been changed into JScript.Encode.

The Script Encoder uses the Scripting.Encoder COM object to do the actual encoding.  The decoding will be done by the script interpreter itself (so we cannot simply call a Scripting.Decoder, because that doesn't exist).

Okay, Let's Play!

Plaintext                      Encoded

Hoi                            #@~^FQAAAA==@#@&CGb@#@&zz O@*@#@&WwIAAA==^#~@
Hai                            #@~^FQAAAA==@#@&CCb@#@&zz O@*@#@&TQIAAA==^#~@
HaiHaiHaiHai                   #@~^IgAAAA==@#@&CCbCmk@#@&CmrCmk@#@&JzRR@*@#@&mgUAAA==^#~@

Cute.

As you can see, @#@& appears to be a newline (@# = Carriage Return, @& = Line Feed), and the position of a character does (sometimes...) matter.  (The first time HaiHai becomes CCbCmk and the second time it's CmrCmk).

Let's just encode a line with a lot of A:

//**Start Encode**
#@~^lgAAAA==@#@&b)zbzbbzbz)bzb)bzb))zbbz)bzbbz))bzbzb)b))zb)bz)bzb))zbb))zb)bz)zb)zbzbbzbz)bzb)bzb))zbbz)bzbbz))bzbzb)b))zb)bz)bzb))zbb))zb)bz)zb)zb@#@&zJO @*@#@&vyIAAA==^#~@

The Algorithm

After staring at this for some time, I discovered that part of the output was repeating (in fact, the entire string repeats itself after 64 characters).

Also, it seems to be that the character A has three different representations: b, z, and )

If you encode a string of B you'll see the same pattern, but with different characters.

This means the encoding will look something like this:

int pick_encoding[64] = {....};
int lookuptable[96][3] = {....};

char encode_char (char c, int pos)
{
  if (!specialchar (c))
    return lookuptable [c - 32][pick_encoding[pos % 64]];
  else
    return escapedchar (c);
}

I assumed that only the ASCII codes 32 to 126 inclusive, and 9 (Tab) are encoded.  The rest is being escaped in a similar fashion as CR and LF.

What's left is the stuff before and after the encoded string.  I did not look into this (yet).  It will probably contain a checksum and some information about the length of the encoded script.

The Encoding Tables

So now we'll have to find out those tables for the encoding.  The pick_encoding table is very simple to discover by just looking at the pattern that was the result of encoding all those A:

int pick_encoding[64] = 
{
   1, 2, 0, 1, 2, 0, 2, 0, 0, 2, 0, 2, 1, 0, 2, 0, 
   1, 0, 2, 0, 1, 1, 2, 0, 0, 2, 1, 0, 2, 0, 0, 2, 
   1, 1, 0, 2, 0, 2, 0, 1, 0, 1, 1, 2, 0, 1, 0, 2, 
   1, 0, 2, 0, 1, 1, 2, 0, 0, 1, 1, 2, 0, 1, 0, 2
};

The string of A's had a CR and LF in front of it, so after skipping the first two entries of the table, you'll see that 0, 1, 2, 0, 2, 0, 0, 2 perfectly matches b, ), z, b, z, b, b, z, with b = 0, ) = 1, and z = 2.
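The match can also be checked mechanically.  The following is a minimal sketch, not part of the original decoder: the helper names rep_of_A and matching_prefix are made up for this illustration, and the function simply counts how many bytes of an encoded run of A's agree with the table when the first A sits at a given stream position.

```c
#include <stddef.h>

static const unsigned char pick_encoding[64] = {
    1, 2, 0, 1, 2, 0, 2, 0, 0, 2, 0, 2, 1, 0, 2, 0,
    1, 0, 2, 0, 1, 1, 2, 0, 0, 2, 1, 0, 2, 0, 0, 2,
    1, 1, 0, 2, 0, 2, 0, 1, 0, 1, 1, 2, 0, 1, 0, 2,
    1, 0, 2, 0, 1, 1, 2, 0, 0, 1, 1, 2, 0, 1, 0, 2
};

/* Which representation of 'A' an encoded byte is (b = 0, ) = 1, z = 2),
   or -1 if the byte is none of the three. */
static int rep_of_A(char c)
{
    switch (c) {
    case 'b': return 0;
    case ')': return 1;
    case 'z': return 2;
    default:  return -1;
    }
}

/* Number of leading bytes of `run` that agree with pick_encoding when
   run[0] sits at stream position first_pos. */
static size_t matching_prefix(const char *run, size_t first_pos)
{
    size_t k = 0;

    while (run[k] != '\0' &&
           rep_of_A(run[k]) == (int)pick_encoding[(first_pos + k) % 64])
        k++;
    return k;
}
```

Called with the encoded A's and a first position of 2 (the CR and LF take positions 0 and 1), the whole run should match; with a first position of 0 the very first byte already disagrees.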

The other table is a matrix that holds three different representations for each character.  Which one will be used, depends on the pick_encoding table.  To find out this matrix, just make a file that will cause every character to be encoded three times.  Make sure the algorithm is "reset" by padding the lines so each group will start on a 64-byte boundary:

   aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
!!!aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
"""aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
###aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
$$$aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

Etcetera.  Note that there are only 59 bytes of padding (the a characters), because the CR and LF at the end of the line count too!  (59 + 2 + 3 = 64).
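Typing 96 such lines by hand is tedious, so here is a small sketch of how one probe line could be generated.  The function name make_probe_line and the GROUP_LEN constant are illustrative only; the layout (three copies of the character, 59 bytes of padding, CR, LF) is the one described above.

```c
#include <string.h>

#define GROUP_LEN 64    /* period of the pick_encoding pattern */

/* Fill line[0..63] with three copies of c, 59 'a' characters of padding
   and a CR LF pair, so that every line is exactly one 64-byte group.
   line[64] receives a terminating NUL for convenience. */
static void make_probe_line(char c, char line[GROUP_LEN + 1])
{
    memset(line, 'a', GROUP_LEN);
    line[0] = line[1] = line[2] = c;
    line[GROUP_LEN - 2] = '\r';
    line[GROUP_LEN - 1] = '\n';
    line[GROUP_LEN] = '\0';
}
```

Looping c over 32 through 126 (plus one line for Tab) and writing each line to a file opened in binary mode gives the probe file; binary mode matters, because a text-mode stream on Windows would turn the LF into a second CR LF and shift everything by one byte.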

After encoding this, you can remove the encoded a again, as well as the @#@& for the CR and LF.  This is what remains:

d7i P~, "Ze JEr a:[ ^yf ]Yu ['L BvE `cv #b* eMC _Q3 ~SB OR  R c z&J !TZ Fq8  +y &f2 c*W *Xl 
v+ G{F %0R ,1O )l= iIp @!@!@! 'x{ @*@*@* g_Q @$@$@$ b)z A$~ Z/; f9G 23A sow M!V Cu_ q(& 9Bx |Fn SJd H\t 1Hg r6} nKh p}5 I]" ?j
U KP: ji` .#j   q (po 5eI }t\ $,] -w' TDY 7?% {m| =|# lCm 48( m^1 N[9 +n 0W6 oLT t44 krb L%N 3V0 Vs^ :hs xU      WGK w2a ;5$ D
.M /dk YOD E;! \-7 hAS 6aX XzH y".      `P uk- 8N) U=? 

So what is this?

It's the encoded representation of ASCII character 9 (Tab) and characters 32 through 126.  Every character has three different representations, so this sums up to 3 * (1 + (126 - 32 + 1)) = 288 characters.

You'll see that the <, >, and @ characters are escaped too, resulting in the following table:

Esc     Org

@#      \r
@&      \n
@!      <
@*      >
@$      @
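As a sketch of how this escape table can be applied, the following expands the two-byte escapes in a piece of encoded text and copies everything else through untouched.  The names unescape_char and unescape_text are chosen for this example, and the handling of an unknown escape (it is copied as-is) is an assumption, not something the Script Encoder documents.

```c
#include <stddef.h>

/* Character that "@" followed by c stands for, or 0 if c is not one of
   the five known escape codes. */
static char unescape_char(char c)
{
    switch (c) {
    case '#': return '\r';
    case '&': return '\n';
    case '!': return '<';
    case '*': return '>';
    case '$': return '@';
    default:  return 0;
    }
}

/* Expand every known @x pair in `in`; other bytes are copied unchanged.
   `out` must hold at least strlen(in) + 1 bytes.  Returns the number of
   bytes written, not counting the terminating NUL. */
static size_t unescape_text(const char *in, char *out)
{
    size_t n = 0;

    while (*in != '\0') {
        if (in[0] == '@' && in[1] != '\0' && unescape_char(in[1]) != 0) {
            out[n++] = unescape_char(in[1]);
            in += 2;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

Note that this only undoes the escapes: the remaining bytes are still table-encoded and depend on their position in the stream.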

I've removed the @!, @*, and @$ from the encoded text too and replaced them with question marks, so the table will stay nice.  This is what you get as a hex dump:

unsigned char encoding[288] = {
        0x64,0x37,0x69, 0x50,0x7E,0x2C, 0x22,0x5A,0x65, 0x4A,0x45,0x72, 
        0x61,0x3A,0x5B, 0x5E,0x79,0x66, 0x5D,0x59,0x75, 0x5B,0x27,0x4C, 
        0x42,0x76,0x45, 0x60,0x63,0x76, 0x23,0x62,0x2A, 0x65,0x4D,0x43, 
        0x5F,0x51,0x33, 0x7E,0x53,0x42, 0x4F,0x52,0x20, 0x52,0x20,0x63, 
        0x7A,0x26,0x4A, 0x21,0x54,0x5A, 0x46,0x71,0x38, 0x20,0x2B,0x79, 
        0x26,0x66,0x32, 0x63,0x2A,0x57, 0x2A,0x58,0x6C, 0x76,0x7F,0x2B, 
        0x47,0x7B,0x46, 0x25,0x30,0x52, 0x2C,0x31,0x4F, 0x29,0x6C,0x3D, 
        0x69,0x49,0x70, 0x3F,0x3F,0x3F, 0x27,0x78,0x7B, 0x3F,0x3F,0x3F, 
        0x67,0x5F,0x51, 0x3F,0x3F,0x3F, 0x62,0x29,0x7A, 0x41,0x24,0x7E, 
        0x5A,0x2F,0x3B, 0x66,0x39,0x47, 0x32,0x33,0x41, 0x73,0x6F,0x77, 
        0x4D,0x21,0x56, 0x43,0x75,0x5F, 0x71,0x28,0x26, 0x39,0x42,0x78, 
        0x7C,0x46,0x6E, 0x53,0x4A,0x64, 0x48,0x5C,0x74, 0x31,0x48,0x67, 
        0x72,0x36,0x7D, 0x6E,0x4B,0x68, 0x70,0x7D,0x35, 0x49,0x5D,0x22, 
        0x3F,0x6A,0x55, 0x4B,0x50,0x3A, 0x6A,0x69,0x60, 0x2E,0x23,0x6A, 
        0x7F,0x09,0x71, 0x28,0x70,0x6F, 0x35,0x65,0x49, 0x7D,0x74,0x5C, 
        0x24,0x2C,0x5D, 0x2D,0x77,0x27, 0x54,0x44,0x59, 0x37,0x3F,0x25, 
        0x7B,0x6D,0x7C, 0x3D,0x7C,0x23, 0x6C,0x43,0x6D, 0x34,0x38,0x28, 
        0x6D,0x5E,0x31, 0x4E,0x5B,0x39, 0x2B,0x6E,0x7F, 0x30,0x57,0x36, 
        0x6F,0x4C,0x54, 0x74,0x34,0x34, 0x6B,0x72,0x62, 0x4C,0x25,0x4E, 
        0x33,0x56,0x30, 0x56,0x73,0x5E, 0x3A,0x68,0x73, 0x78,0x55,0x09, 
        0x57,0x47,0x4B, 0x77,0x32,0x61, 0x3B,0x35,0x24, 0x44,0x2E,0x4D, 
        0x2F,0x64,0x6B, 0x59,0x4F,0x44, 0x45,0x3B,0x21, 0x5C,0x2D,0x37, 
        0x68,0x41,0x53, 0x36,0x61,0x58, 0x58,0x7A,0x48, 0x79,0x22,0x2E, 
        0x09,0x60,0x50, 0x75,0x6B,0x2D, 0x38,0x4E,0x29, 0x55,0x3D,0x3F 
};

So, encoding character c at position i goes as follows:

  1. Look up which representation to use (the first, second or third): pick_encoding[i mod 64]
  2. Find the representations in the huge table: encoding[c * 3]
  3. Encoded character = encoding[c * 3 + pick_encoding[i % 64]];

Because the table starts at 9 and then goes to 32, you'll have to do some corrections.  But we'll get to that later, as we are not really interested in encoding after all.  We want to be able to do some decoding!
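For completeness, the correction could look roughly like this.  It is a sketch under the assumption that the table is laid out exactly as in the hex dump (Tab in row 0, then ASCII 32 through 126); table_row and encode_char are names invented here, and returning 0 for characters that have to be escaped is just a convention of this example.

```c
#include <stddef.h>

/* Row of the 288-byte table holding the three representations of c:
   row 0 is Tab, rows 1..95 are ASCII 32..126.  Returns -1 otherwise. */
static int table_row(unsigned char c)
{
    if (c == 9)
        return 0;
    if (c >= 32 && c <= 126)
        return c - 31;
    return -1;
}

/* Encode c at stream position pos.  `table` is the 288-byte encoding
   table and `pick` the 64-entry pick_encoding table.  Returns 0 for
   characters the table does not cover or that are escaped (<, >, @). */
static unsigned char encode_char(const unsigned char *table,
                                 const unsigned char *pick,
                                 unsigned char c, size_t pos)
{
    int row = table_row(c);

    if (row < 0 || c == '<' || c == '>' || c == '@')
        return 0;
    return table[row * 3 + pick[pos % 64]];
}
```

With the two tables from this article, an A at positions 2, 3, and 4 comes out as b, ), and z, which is how the long run of A's above starts.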

The Decoding Tables

The pick_encoding table will stay the same.  This is because each character (except for the escaped ones, of course) will be in the same place as the original.  Then, we could just look up the encoded character in the table.  For instance, an A in encoded text (hex 0x41), occurs on these places in the "encoding" table:

  • Row 9, group 4, representation 1 = B
  • Row 23, group 1, representation 2 = w
  • Row 10, group 3, representation 3 = E

So an A in the encoded text is a B, w, or E, depending on its position.

Where there is a 0 in the pick_encoding table, it's a B.

For 1 it's a w.

For 2 it's an E.

You don't want to go looking through the encoding table each time, trying to find those numbers.  By transforming the encoding table into another table that is indexed by the encoded byte itself, you can just go to position 0x41 and pick the correct representation.  (The correction of 31, needed because the table skips everything below space except for Tab, is applied once, while the new table is being built.)

unsigned char transformed[3][128];  /* indexed by encoded byte, which can be as high as 0x7F */

void maketrans (void)
{
  int i, j;
  for (i = 31; i <= 126; i++)
    for (j = 0; j < 3; j++)
      transformed[j][encoding[(i - 31) * 3 + j]] = (i == 31) ? 9 : i;
}

With this matrix, it's very simple to look up the original character by simply looking it up in our table.

Assume i is the position of the character and c is the character again.  Then: decoded = transformed[pick_encoding[i % 64]][c];
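Put together, the lookup can be sketched as follows.  This is an illustration rather than the code of the decoder itself: build_decode_table and decode_run are hypothetical names, both tables are passed in as parameters, and filling unused slots with a question mark is a choice made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Invert a 288-byte encoding table (Tab row first, then ASCII 32..126)
   into decode[representation][encoded byte].  Slots that no character
   maps to are set to '?'. */
static void build_decode_table(const unsigned char *table,
                               unsigned char decode[3][128])
{
    int row, rep;

    memset(decode, '?', 3 * 128);
    for (row = 0; row < 96; row++)
        for (rep = 0; rep < 3; rep++)
            decode[rep][table[row * 3 + rep] & 0x7F] =
                (unsigned char)(row == 0 ? 9 : row + 31);
}

/* Decode n table-encoded bytes.  first_pos is the stream position of
   in[0]; `pick` is the 64-entry pick_encoding table.  Escape pairs such
   as @# have to be dealt with before calling this.  out needs n + 1 bytes. */
static void decode_run(unsigned char decode[3][128],
                       const unsigned char *pick,
                       const char *in, size_t n, size_t first_pos,
                       char *out)
{
    size_t k;

    for (k = 0; k < n; k++)
        out[k] = (char)decode[pick[(first_pos + k) % 64]]
                             [(unsigned char)in[k] & 0x7F];
    out[n] = '\0';
}
```

Fed with the tables from this article, decoding CCbCmk starting at position 2 (right after the CR and LF) should give back HaiHai from the earlier experiment.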

The Encoding of the Length-Field

So what's left is to find out how many characters there are to decode.

If we just keep decoding stuff, we will decode part of the HTML that's behind the encoded script.  This can be avoided by stopping when a < is encountered (< will never appear in an encoded stream), but even in the case we are looking at a "pure" script file (*.js or *.vbs), there is some checksum stuff behind the actual data, which we should not decode.

I created a number of files of different size.  By giving them a *.js extension the entire file is encoded without the Script Encoder looking for a start marker.

The results are below (only the first 12 bytes are displayed):

Length    First 12 Bytes                        ASCII

1         23 40 7E 5E 41 51 41 41-41 41 3D 3D   #@~^AQAAAA==
2         23 40 7E 5E 41 67 41 41-41 41 3D 3D   #@~^AgAAAA==
3         23 40 7E 5E 41 77 41 41-41 41 3D 3D   #@~^AwAAAA==
4         23 40 7E 5E 42 41 41 41-41 41 3D 3D   #@~^BAAAAA==
5         23 40 7E 5E 42 51 41 41-41 41 3D 3D   #@~^BQAAAA==
6         23 40 7E 5E 42 67 41 41-41 41 3D 3D   #@~^BgAAAA==
7         23 40 7E 5E 42 77 41 41-41 41 3D 3D   #@~^BwAAAA==
8         23 40 7E 5E 43 41 41 41-41 41 3D 3D   #@~^CAAAAA==
9         23 40 7E 5E 43 51 41 41-41 41 3D 3D   #@~^CQAAAA==
32        23 40 7E 5E 49 41 41 41-41 41 3D 3D   #@~^IAAAAA==
48        23 40 7E 5E 4D 41 41 41-41 41 3D 3D   #@~^MAAAAA==
80        23 40 7E 5E 55 41 41 41-41 41 3D 3D   #@~^UAAAAA==
96        23 40 7E 5E 59 41 41 41-41 41 3D 3D   #@~^YAAAAA==
103       23 40 7E 5E 5A 77 41 41-41 41 3D 3D   #@~^ZwAAAA==
104       23 40 7E 5E 61 41 41 41-41 41 3D 3D   #@~^aAAAAA==
111       23 40 7E 5E 62 77 41 41-41 41 3D 3D   #@~^bwAAAA==
116       23 40 7E 5E 64 41 41 41-41 41 3D 3D   #@~^dAAAAA==
166       23 40 7E 5E 70 67 41 41-41 41 3D 3D   #@~^pgAAAA==
216       23 40 7E 5E 32 41 41 41-41 41 3D 3D   #@~^2AAAAA==
265       23 40 7E 5E 43 51 45 41-41 41 3D 3D   #@~^CQEAAA==
451       23 40 7E 5E 77 77 45 41-41 41 3D 3D   #@~^wwEAAA==

The length seems to be encoded in the 5th to 10th byte, and 41 appears to be representing zero.

The first byte of the length seems to increase by one when the length increases by 4.  Also, the second byte alternates between 41, 51, 67, and 77.

If you look at length 166, this value is 0x70, where it should be: 0x41 + (166 / 4) = 0x6A

So something goes wrong, and it can be narrowed down to length 104, where it suddenly jumps from 0x5A to 0x61.

This puzzled me for a long time, until I realized that: 0x5A = Z and 0x61 = a

And yes, the length turns out to be Base64 encoded indeed.  :)
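To make that concrete, here is a minimal sketch of decoding the six Base64 digits that follow the #@~^ marker.  It assumes the standard Base64 alphabet and a little-endian 32-bit value; the latter is suggested by the 265 and 451 rows, where the second byte of the length becomes 1.  The names b64_digit and decode_length are made up for this example.

```c
#include <string.h>

/* Value of one Base64 digit (standard alphabet), or -1 if c is not one. */
static int b64_digit(char c)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const char *p = (c != '\0') ? strchr(alphabet, c) : NULL;

    return p ? (int)(p - alphabet) : -1;
}

/* Decode the six Base64 digits of a length field into a little-endian
   32-bit value.  Returns 0 on success, -1 if a digit is invalid. */
static int decode_length(const char *field, unsigned long *length)
{
    int d[6], i;
    unsigned long b0, b1, b2, b3;

    for (i = 0; i < 6; i++) {
        d[i] = b64_digit(field[i]);
        if (d[i] < 0)
            return -1;
    }
    /* 36 bits of input carry four whole bytes; the last 4 bits are unused. */
    b0 = (unsigned long)((d[0] << 2) | (d[1] >> 4));
    b1 = (unsigned long)(((d[1] & 0x0F) << 4) | (d[2] >> 2));
    b2 = (unsigned long)(((d[2] & 0x03) << 6) | d[3]);
    b3 = (unsigned long)((d[4] << 2) | (d[5] >> 4));
    *length = b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);
    return 0;
}
```

For example, the digits pgAAAA (hex 70 67 41 41 41 41) give 166 and wwEAAA (hex 77 77 45 41 41 41) gives 451, matching the table.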

The Checksum

At the end of the encoded data is apparently some kind of checksum.  I did not look into this any further.
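The decoder listing further down does verify this field: it adds up every decoded byte and subtracts the value of the six Base64 digits that follow the data, expecting the result to be zero.  The sketch below is inferred from that listing, not from any Microsoft documentation, and script_checksum is a name chosen here.

```c
#include <stddef.h>

/* Plain sum of the decoded bytes, accumulated the same way the decoder
   listing does with its csum variable. */
static unsigned long script_checksum(const unsigned char *decoded, size_t n)
{
    unsigned long sum = 0;
    size_t k;

    for (k = 0; k < n; k++)
        sum += decoded[k];
    return sum;
}
```

For the "Hoi" experiment, the decoded block is CR LF, Hoi, CR LF, //-->, CR LF, which adds up to 603; reading the trailing digits WwIAAA the same way as the length field also gives 603.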

The decoder program

The further working of the decoder program, which can be downloaded from the scrdec home page, is left as an exercise to the reader.

It's implemented as a "Turing-like" state machine.  The decoder will treat .js and .vbs files as fully encoded, while .htm(l) and .asp files are seen as files that contain script amongst other things - like HTML code.

The decoder simply takes two arguments: input filename (encoded), and output filename (decoded).

There is one thing lacking in the decoder: the value of the <SCRIPT LANGUAGE="..."> attribute, is not changed back into the original form.  You'd better use a tool like sed for that.
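If sed is not at hand, the same clean-up is easy to sketch in C.  The function name strip_encode_suffix is hypothetical, and the sketch assumes the attribute is spelled exactly JScript.Encode or VBScript.Encode (case-sensitive), which is how the Script Encoder wrote it in the example page above.

```c
#include <string.h>

/* Remove, in place, every ".Encode" that directly follows "Script"
   (as in JScript.Encode and VBScript.Encode).  Returns the number of
   suffixes removed. */
static int strip_encode_suffix(char *text)
{
    static const char suffix[] = ".Encode";
    const size_t slen = sizeof suffix - 1;
    int removed = 0;
    char *p = text;

    while ((p = strstr(p, suffix)) != NULL) {
        if (p - text >= 6 && memcmp(p - 6, "Script", 6) == 0) {
            /* Close the gap, including the terminating NUL. */
            memmove(p, p + slen, strlen(p + slen) + 1);
            removed++;
        } else {
            p += slen;      /* some other ".Encode"; leave it alone */
        }
    }
    return removed;
}
```

Running it over the decoded page turns <SCRIPT LANGUAGE="JScript.Encode"> back into <SCRIPT LANGUAGE="JScript"> and leaves unrelated text untouched.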

Conclusion

It's not just sad that Microsoft made a tool like this.  They've probably asked Bill Gates' little nephew to write this code.  The really bad part is that Microsoft actually recommends people to use this piece of crap, and because of that, people will rely on it, even though the documentation hints that it's unsafe.  (Nobody reads the docs anyway...)

Security by obscurity is a bad, bad idea.  Instead of encouraging that approach, Microsoft should educate programmers to find other ways to store their passwords and sensitive data, and tell them that an algorithm or any other piece of code that needs to be "hidden," is just bad design.

scrdec18.c:

/**********************************************************************/
/* scrdec18.c - Decoder for Microsoft Script Encoder                    */
/* Version 1.8                                                        */
/*                                                                    */
/* COPYRIGHT:                                                         */
/* (c)2000-2005 MrBrownstone, mrbrownstone@ virtualconspiracy.com     */
/* v1.8 Now correctly decodes characters 0x00-0x1F, thanks to 'Zed'   */
/* v1.7 Bypassed new HTMLGuardian protection and added -dumb switch   */
/*       to disable this                                              */
/* v1.6 Added HTML Decode option (-htmldec)                           */
/* v1.5 Bypassed a clever trick defeating this tool                   */
/* v1.4 Some changes by Joe Steele to correct minor stuff             */
/*                                                                    */
/* DISCLAIMER:                                                        */
/* This program is for demonstrative and educational purposes only.   */
/* Use of this program is at your own risk. The author cannot be held */
/* responsible if any laws are broken by use of this program.         */
/*                                                                    */
/* If you use or distribute this code, this message should be held    */
/* intact. Also, any program based upon this code should display the  */
/* copyright message and the disclaimer.                              */
/**********************************************************************/

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define LEN_OUTBUF 64
#define LEN_INBUF 1024

#define STATE_INIT_COPY		100
#define STATE_COPY_INPUT	101
#define STATE_SKIP_ML		102
#define STATE_CHECKSUM		103
#define STATE_READLEN		104
#define STATE_DECODE		105
#define STATE_UNESCAPE		106
#define STATE_FLUSHING		107
#define STATE_DBCS			108
#define STATE_INIT_READLEN	109
#define STATE_URLENCODE_1	110
#define STATE_URLENCODE_2	111
#define STATE_WAIT_FOR_CLOSE 112
#define STATE_WAIT_FOR_OPEN 113
#define STATE_HTMLENCODE	114

unsigned char rawData[292] = {
        0x64,0x37,0x69, 0x50,0x7E,0x2C, 0x22,0x5A,0x65, 0x4A,0x45,0x72, 
        0x61,0x3A,0x5B, 0x5E,0x79,0x66, 0x5D,0x59,0x75, 0x5B,0x27,0x4C, 
        0x42,0x76,0x45, 0x60,0x63,0x76, 0x23,0x62,0x2A, 0x65,0x4D,0x43, 
        0x5F,0x51,0x33, 0x7E,0x53,0x42, 0x4F,0x52,0x20, 0x52,0x20,0x63, 
        0x7A,0x26,0x4A, 0x21,0x54,0x5A, 0x46,0x71,0x38, 0x20,0x2B,0x79, 
        0x26,0x66,0x32, 0x63,0x2A,0x57, 0x2A,0x58,0x6C, 0x76,0x7F,0x2B, 
        0x47,0x7B,0x46, 0x25,0x30,0x52, 0x2C,0x31,0x4F, 0x29,0x6C,0x3D, 
        0x69,0x49,0x70, 0x3F,0x3F,0x3F, 0x27,0x78,0x7B, 0x3F,0x3F,0x3F, 
        0x67,0x5F,0x51, 0x3F,0x3F,0x3F, 0x62,0x29,0x7A, 0x41,0x24,0x7E, 
        0x5A,0x2F,0x3B, 0x66,0x39,0x47, 0x32,0x33,0x41, 0x73,0x6F,0x77, 
        0x4D,0x21,0x56, 0x43,0x75,0x5F, 0x71,0x28,0x26, 0x39,0x42,0x78, 
        0x7C,0x46,0x6E, 0x53,0x4A,0x64, 0x48,0x5C,0x74, 0x31,0x48,0x67, 
        0x72,0x36,0x7D, 0x6E,0x4B,0x68, 0x70,0x7D,0x35, 0x49,0x5D,0x22, 
        0x3F,0x6A,0x55, 0x4B,0x50,0x3A, 0x6A,0x69,0x60, 0x2E,0x23,0x6A, 
        0x7F,0x09,0x71, 0x28,0x70,0x6F, 0x35,0x65,0x49, 0x7D,0x74,0x5C, 
        0x24,0x2C,0x5D, 0x2D,0x77,0x27, 0x54,0x44,0x59, 0x37,0x3F,0x25, 
        0x7B,0x6D,0x7C, 0x3D,0x7C,0x23, 0x6C,0x43,0x6D, 0x34,0x38,0x28, 
        0x6D,0x5E,0x31, 0x4E,0x5B,0x39, 0x2B,0x6E,0x7F, 0x30,0x57,0x36, 
        0x6F,0x4C,0x54, 0x74,0x34,0x34, 0x6B,0x72,0x62, 0x4C,0x25,0x4E, 
        0x33,0x56,0x30, 0x56,0x73,0x5E, 0x3A,0x68,0x73, 0x78,0x55,0x09, 
        0x57,0x47,0x4B, 0x77,0x32,0x61, 0x3B,0x35,0x24, 0x44,0x2E,0x4D, 
        0x2F,0x64,0x6B, 0x59,0x4F,0x44, 0x45,0x3B,0x21, 0x5C,0x2D,0x37, 
        0x68,0x41,0x53, 0x36,0x61,0x58, 0x58,0x7A,0x48, 0x79,0x22,0x2E, 
        0x09,0x60,0x50, 0x75,0x6B,0x2D, 0x38,0x4E,0x29, 0x55,0x3D,0x3F,
		0x51,0x67,0x2f
} ;

const unsigned char pick_encoding[64] = {
1, 2, 0, 1, 2, 0, 2, 0, 0, 2, 0, 2, 1, 0, 2, 0, 
1, 0, 2, 0, 1, 1, 2, 0, 0, 2, 1, 0, 2, 0, 0, 2, 
1, 1, 0, 2, 0, 2, 0, 1, 0, 1, 1, 2, 0, 1, 0, 2, 
1, 0, 2, 0, 1, 1, 2, 0, 0, 1, 1, 2, 0, 1, 0, 2
};

unsigned char transformed[3][128]; /* indexed by encoded byte (0x00-0x7F) */
int digits[256]; /* Base64 digit value for each possible byte */

int urlencoded = 0;
int htmlencoded = 0;
int verbose = 0;
int smart = 1;

unsigned char unescape (unsigned char c)
{
	static unsigned char escapes[] = "#&!*$";
	static unsigned char escaped[] = "\r\n<>@";
	int i=0;

	if (c > 127)
		return c;
	while (escapes[i])
	{
		if (escapes[i] == c)
			return escaped[i];
		i++;
	}	
	return '?';
}

void maketrans (void)
{
	int i, j;

	for (i=0; i<32; i++)
		for (j=0; j<3; j++) 
			transformed[j][i] = i;

	for (i=31; i<=127; i++)
		for (j=0; j<3; j++) 
			transformed[j][rawData[(i-31)*3 + j]] = (i==31) ? 9 : i;
}

void makedigits (void)
{
	int i;

	for (i=0; i<26; i++)
	{
		digits['A'+i] = i;
		digits['a'+i] = i+26;
	}
	for (i=0; i<10; i++)
		digits['0'+i] = i+52;
	digits[0x2b] = 62;
	digits[0x2f] = 63;
}

unsigned long int decodeBase64 (unsigned char *p)
{
	unsigned long int val = 0;

	val +=  (digits[p[0]] << 2);
	val +=  (digits[p[1]] >> 4);
	val +=  (digits[p[1]] & 0xf) << 12;
	val += ((digits[p[2]] >> 2) << 8); 
	val += ((digits[p[2]] & 0x3) << 22);
	val +=  (digits[p[3]] << 16);
	val += ((digits[p[4]] << 2) << 24);
	val += ((digits[p[5]] >> 4) << 24);

	/* 543210 543210 543210 543210 543210 543210

	   765432 
	          10
	                 ba98
	            fedc
	                     76
	                        543210
                                   fedcba 98----
       |- LSB -||-     -||-     -| |- MSB -|
	*/
	return val;
}

/*
 Char. number range  |        UTF-8 octet sequence
      (hexadecimal)    |              (binary)
   --------------------+---------------------------------------------
   0000 0000-0000 007F | 0xxxxxxx
   0000 0080-0000 07FF | 110xxxxx 10xxxxxx
   0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
   0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
*/

int isLeadByte (unsigned int cp, unsigned char ucByte)
{
	/* Code page 932 - Japanese Shift-JIS       - 0x81-0x9f 
	                                              0xe0-0xfc 
                 936 - Simplified Chinese GBK   - 0xa1-0xfe
	             949 - Korean Wansung           - 0x81-0xfe
				 950 - Traditional Chinese Big5 - 0x81-0xfe 
	            1361 - Korean Johab             - 0x84-0xd3 
												  0xd9-0xde
												  0xe0-0xf9 */
	switch (cp)
	{
		case 932:
			if ((ucByte > 0x80) && (ucByte < 0xa0))	return 1;
			if ((ucByte > 0xdf) && (ucByte < 0xfd))	return 1;
			else return 0;
		case 936:
			if ((ucByte > 0xa0) && (ucByte < 0xff)) return 1;
			else return 0;
		case 949:
		case 950:
			if ((ucByte > 0x80) && (ucByte < 0xff)) return 1;
			else return 0;
		case 1361:
			if ((ucByte > 0x83) && (ucByte < 0xd4)) return 1;
			if ((ucByte > 0xd8) && (ucByte < 0xdf)) return 1;
			if ((ucByte > 0xdf) && (ucByte < 0xfa)) return 1;
			else return 0;
		default:
			return 0;
	}

}


struct entitymap {
	char *entity;
	char mappedchar;
};

struct entitymap entities[] = {
	{"excl",33},{"quot",34},{"num",35},{"dollar",36},{"percent",37},
	{"amp",38},{"apos",39},{"lpar",40},{"rpar",41},{"ast",42},
	{"plus",43},{"comma",44},{"period",46},{"colon",58},{"semi",59},
	{"lt",60},{"equals",61},{"gt",62},{"quest",63},{"commat",64},
	{"lsqb",91},{"rsqb",93},{"lowbar",95},{"lcub",123},{"verbar",124},
	{"rcub",125},{"tilde",126}, {NULL, 0}
};


char decodeMnemonic ( unsigned char *mnemonic)
{
	int i=0;
	while (entities[i].entity != NULL)
	{
		if (strcmp(entities[i].entity, (const char*)mnemonic)==0)
			return entities[i].mappedchar;
		i++;
	}
	printf ("Warning: did not recognize HTML entity '%s'\n", mnemonic);
	return '?';
}

int ScriptDecoder (unsigned char *inname, unsigned char *outname, unsigned int cp)
{
	unsigned char inbuf[LEN_INBUF+1];
	unsigned char outbuf[LEN_OUTBUF+1];
	unsigned char c, c1, c2, lenbuf[7], csbuf[7], htmldec[8];
	unsigned char marker[] = "#@~^";
	int ustate, nextstate, state = 0;
	int i, j, k, m, ml, hd = 0;
	int utf8 = 0;
	unsigned long int csum = 0, len = 0;
	FILE *infile, *outfile;

	infile = fopen ((const char*)inname, "rb");
	outfile = fopen ((const char*)outname, "wb");
	if (!infile || !outfile)
	{
		printf ("Error opening file!\n");
		return 10;
	}
	
	maketrans();
	makedigits();
	memset (inbuf, 0, sizeof (inbuf));
	memset (outbuf, 0, sizeof (outbuf));
	memset (lenbuf, 0, sizeof (lenbuf));
	
	state = STATE_INIT_COPY;
	i = 0;
	j = 0;

	while (state)
	{
		if (inbuf[i] == 0)
		{
			if (feof (infile))
			{
				if (len) 
				{
					printf ("Error: Premature end of file.\n");
					if (utf8>0)
						printf ("Tip: The file seems to contain special characters, try the -cp option.\n");
				}
				break;
			}

			memset (inbuf, 0, sizeof (inbuf));
			fgets ((char*)inbuf, LEN_INBUF, infile);
			i = 0;
			continue;
		}

		if (j == LEN_OUTBUF)
		{
			fwrite (outbuf, sizeof(char), j, outfile);
			j = 0;
		}

		if ((urlencoded==1) && (inbuf[i]=='%'))
		{
			ustate = state;				/* save state */
			state = STATE_URLENCODE_1;	/* enter decoding state */
			i++;						/* flush char */
			continue;
		}

		/* 2 means we do urldecoding but wanted to avoid decoding an
			already decoded % for the second time */

		if (urlencoded==2)
			urlencoded=1;
		
		if ((htmlencoded==1) && (inbuf[i]=='&'))
		{
			ustate = state;
			state = STATE_HTMLENCODE;
			hd = 0;
			i++;
			continue;
		}

		/* 2 means we do htmldecoding but wanted to avoid decoding an
			already decoded & for the second time */

		if (htmlencoded==2)
			htmlencoded=1;

		switch (state)
		{
			case STATE_INIT_COPY: 
				ml = strlen ((const char*)marker);
				m = 0;
				state = STATE_COPY_INPUT;
				break;

			/* after decoding a block, we have to wait for the current 
			   script block to be closed (>) */
		
			case STATE_WAIT_FOR_CLOSE:
				if (inbuf[i] == '>')
					state = STATE_WAIT_FOR_OPEN;
				outbuf[j++] = inbuf[i++];
				break;

			/* and a new block to be opened again (<) */
			case STATE_WAIT_FOR_OPEN:
				if (inbuf[i] == '<')
					state = STATE_INIT_COPY;
				outbuf[j++] = inbuf[i++];
				break;

			case STATE_COPY_INPUT:
				if (inbuf[i] == marker[m])
				{
					i++;
					m++;
				}
				else
				{
					if (m)
					{
						k = 0;
						state = STATE_FLUSHING;
					}
					else
						outbuf[j++] = inbuf[i++];

				}
				if (m == ml)
					state = STATE_INIT_READLEN;
				break;

			case STATE_FLUSHING:
				outbuf[j++] = marker[k++];
				m--;
				if (m==0)
					state = STATE_COPY_INPUT;
				break;
			
			case STATE_SKIP_ML: 
				i++;
				if (!(--ml))
					state = nextstate;
				break;


			case STATE_INIT_READLEN: 
				ml = 6;
				state = STATE_READLEN;
				break;

			case STATE_READLEN: 
				lenbuf[6-ml] = inbuf[i++];
				if (!(--ml))
				{
					len = decodeBase64 (lenbuf);
					if (verbose)
						printf ("Msg: Found encoded block containing %lu characters.\n", len);
					m = 0;
					ml = 2;
					state = STATE_SKIP_ML;
					nextstate = STATE_DECODE;
				}
				break;

			case STATE_DECODE: 
				if (!len)
				{
					ml = 6;
					state = STATE_CHECKSUM;
					break;
				}
				if (inbuf[i] == '@') 
					state = STATE_UNESCAPE;
				else
				{
					if ((inbuf[i] & 0x80) == 0)
					{
						outbuf[j++] = c = transformed[pick_encoding[m%64]][inbuf[i]];
						csum += c;
						m++;
					}
					else 
					{
						if (!cp && (inbuf[i] & 0xc0)== 0x80) 
						{
							// utf-8 but not a start byte
							len++;
							utf8=1;
						}
						outbuf[j++] = inbuf[i];
						if ((cp) && (isLeadByte (cp,inbuf[i])))
							state = STATE_DBCS;
					}
				}
				i++;
				len--;
				break;

			case STATE_DBCS:
				outbuf[j++] = inbuf[i++];
				state = STATE_DECODE;
				break;
				
			case STATE_UNESCAPE: 
				outbuf[j++] = c = unescape (inbuf[i++]);
				csum += c;
				len--;
				m++;
				state = STATE_DECODE;
				break;

			case STATE_CHECKSUM: 
				csbuf[6-ml] = inbuf[i++];
				if (!(--ml))
				{
					csum -= decodeBase64 (csbuf);
					if (csum)
					{
						printf ("Error: Incorrect checksum! (%lu)\n", csum);
						if (cp)
							printf ("Tip: Maybe try another codepage.\n");
						else
						{
							if (utf8>0)
								printf ("Tip: The file seems to contain special characters, try the -cp option.\n");
							else
								printf ("Tip: the file may be corrupted.\n");
						}
						csum=0;
					}
					else 
					{
						if (verbose)
							printf ("Msg: Checksum OK\n");
					}
					m = 0;
					ml = 6;
					state = STATE_SKIP_ML;
					if (smart)
	 					nextstate = STATE_WAIT_FOR_CLOSE;
					else 
						nextstate = STATE_INIT_COPY;
				}
				break;

			/* urlencoded, the first character */
			case STATE_URLENCODE_1:
				c1 = inbuf[i++] - 0x30;
				if (c1 > 0x9) c1-= 0x07;
				if (c1 > 0x10) c1-= 0x20;
				state = STATE_URLENCODE_2;
				break;

			/* urlencoded, second character */
			case STATE_URLENCODE_2:
				c2 = inbuf[i] - 0x30;
				if (c2 > 0x9) c2-= 0x07;
				if (c2 > 0x10) c2-= 0x20;
				inbuf[i] = c2 + (c1<<4);	/* copy decoded char back on input */
				urlencoded=2;				/* avoid looping in case this was an % */
				state = ustate;				/* restore old state */
				break;

			/* htmlencoded */
			case STATE_HTMLENCODE:
				c1 = inbuf[i];
				if (c1 != ';')
				{
					i++;
					htmldec[hd++] = c1;
					if (hd>7)
					{
						htmldec[7]=0;
						printf ("Error: HTML decode encountered a too long mnemonic (%s...)\n", htmldec);
						exit(10);
					}
				}
				else /* ';' = end of mnemonic */
				{
					htmldec[hd] = 0;
					inbuf[i] = decodeMnemonic (htmldec); /* skip the & */
					htmlencoded = 2;		/* avoid looping in case of & */
					state = ustate;
				}
				break;
			default:
				printf ("Internal Error: Invalid state: %d\n", state);
				break;
		}
	}
	
	fwrite (outbuf, sizeof (char), j, outfile);
	fclose (infile);
	fclose (outfile);
	return 0;
}


int main (int argc, char **argv)
{
	int i, cp = 0;

	if (argc < 3)
	{
		puts ("ScrDec v1.8 - Decoder for Microsoft Script Encoder\n"
			"(c)2000-2005 MrBrownstone, mrbrownstone@ virtualconspiracy.com\n"
			"Home page: http://www.virtualconspiracy.com/scrdec.html\n\n"
			"Usage: scrdec18 <infile> <outfile> [-cp codepage] [-urldec|-htmldec]\n"
			"  [-verbose] [-dumb]\n\n"
			"Code pages can be 932 - Japanese\n"
			"                  936 - Chinese (Simplified)\n"
			"                  950 - Chinese (Traditional)\n"
			"                  949 - Korean (Wansung)\n"
			"                 1361 - Korean (Johab)\n"
			"Any other code pages don't need to be specified.\n\n"
			"Use -urldec to unescape %xx style encoding on the fly, or\n"
			" -htmldec to unescape &amp; style encoding.\n"
			"For extra information, add the -verbose switch\n"
			"You might not want to use the smart HTMLGuardian defeation mechanism.\n"
			"  In that case, add the -dumb switch.\n");
		return 10;
	}

	i=3;
	while (i<argc)
	{
		if (strcmp (argv[i], "-cp")==0)
		{
			i++;
			if (i<argc) cp = atoi (argv[i]);
			else
			{
				puts ("-cp should be followed by a code page identifier");
				return 10;
			}
		}
		else
		if (strcmp (argv[i], "-urldec")==0)
			urlencoded = 1;
		else
		if (strcmp (argv[i], "-htmldec")==0)
			htmlencoded = 1;
		else
		if (strcmp (argv[i], "-verbose")==0)
			verbose = 1;
		else
		if (strcmp (argv[i], "-dumb")==0)
			smart = 0;
		i++;
	}
	return ScriptDecoder ((unsigned char*)argv[1], (unsigned char*)argv[2], cp);
}


Usage

Script Decoder is a simple Win32 command line executable that is used as follows:

scrdec <infile> <outfile> [<code page>]

There's no fancy stuff like wildcard support or overwriting the input file with the output file. If you really want to do that, just use the DOS for command:

md decoded
for %a in (*.asp) do scrdec %a decoded\%a
del *.asp
move decoded\*.asp .
rd decoded

Note that the FOR command does not support long file names.
Make backups for safety!

After running the decoder, you'll see that all garbled blocks of script, like
<%#@~^swIAAA==@#@&@!Z OJz@#@&zJ;/WaX.kTtO©~8,,R~HbmDKdG0DP;W.wG.mYrW
PzVs~"ko4OkP]+knM\n9R@#@&0!x1OkKx~DrWHZWM.+1YAMGADv#`@#@&~,kW`%>
will have been decoded into their original form. Please note that you will still have to manually strip the '.Encode' out of the 'JScript.Encode' and 'VBScript.Encode' in the <script language=".."> tags!

The script decoder recognises all encoded blocks that start with the sequence #@~^, so it will correctly decode 'plain files' that only contain script, ASP style blocks <%script%>, and <script language="...">script</script> blocks.
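
To illustrate why one marker is enough to handle all three kinds of files, here is a small sketch that scans a buffer for #@~^ without caring what surrounds it. This is not the scanning code from scrdec18.c; find_encoded_block and count_encoded_blocks are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the offset of the first "#@~^" marker at
   or after position from, or -1 if there is none. buf does not have to
   be NUL-terminated. */
long find_encoded_block (const unsigned char *buf, size_t len, size_t from)
{
	static const char marker[] = "#@~^";
	const size_t mlen = sizeof marker - 1;
	size_t i;

	assert (buf != NULL);
	if (len < mlen) return -1;
	for (i = from; i + mlen <= len; i++)
		if (memcmp (buf + i, marker, mlen) == 0)
			return (long)i;
	return -1;
}

/* Counts the encoded blocks in buf by stepping from marker to marker. */
size_t count_encoded_blocks (const unsigned char *buf, size_t len)
{
	size_t count = 0, pos = 0;
	long hit;

	while ((hit = find_encoded_block (buf, len, pos)) >= 0)
	{
		count++;
		pos = (size_t)hit + 4;	/* continue after this marker */
	}
	return count;
}
```

Because the search ignores the <% or <script> wrapper entirely, a plain script file, an ASP page, and an HTML page all look the same to it.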

Starting with version 1.2, the script decoder can use different code pages, so it can decode scripts that contain Asian characters. If you want to decode such scripts, just supply the code page identifier after the -cp switch.
   Id   / Code Page
   932  / Japanese
   936  / Chinese (Simplified)
   950  / Chinese (Traditional)
   949  / Korean (Wansung)
   1361 / Korean (Johab)
   
If your scripts use any other code page, you do not need to specify one.
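
The rule above can be written down as a tiny lookup. This is only a sketch: code_page_from_arg is a hypothetical helper, and returning 0 for "nothing needs to be specified" is my own convention for the example, not something taken from scrdec18.c. The argument is converted with atoi, the same way main does it for -cp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns the code page if arg names one of the
   five identifiers from the table above, and 0 (assumed to mean "no
   code page needed") for anything else. */
int code_page_from_arg (const char *arg)
{
	static const int listed[] = { 932, 936, 950, 949, 1361 };
	int cp;
	size_t i;

	assert (arg != NULL);
	cp = atoi (arg);
	for (i = 0; i < sizeof listed / sizeof listed[0]; i++)
		if (listed[i] == cp)
			return cp;
	return 0;
}
```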



Alternatively, you can use this little VBScript that Gene Naftulyev sent me (thanks dude!), or the further improved version by Ninio Erez.

'Script created and copyright by Gene Naftulyev 
'Free use granted as long as this notice is included.

WScript.echo "This script will decode all *.asp files in the same directory as " & _
  "itself and create a Decoded folder. If the script fails be sure you have " & _
  "scrdec14.exe in the same directory."

Set fs = CreateObject("Scripting.FileSystemObject")
Set WshShell = WScript.CreateObject("WScript.Shell")
Set folder1 = fs.GetFolder(left( WScript.ScriptFullName ,inStrRev( WScript.ScriptFullName,"\")))
if not fs.folderexists(left( WScript.ScriptFullName ,inStrRev( WScript.ScriptFullName,"\"))&"Decoded") then
set folder2 = fs.createfolder("Decoded")
end if
For Each FileName In folder1.files
if inStr(ucase(FileName.name),".ASP" ) then
intReturn = WshShell.Run("cmd /c scrdec14 " & FileName.name & " Decoded\" & FileName.name, 7, FALSE)
end if
Next

'Improved version by Ninio Erez

Dim WshShell, BtnCode

Set WshShell = WScript.CreateObject("WScript.Shell")
Set objArgs = WScript.Arguments

if objArgs.Count>0 then
BtnCode = WshShell.Popup("Do you want to decode those files?", 10, "Answer This Question:", 1 + 32)
Select Case BtnCode
   case 1
        WScript.echo "This script will decode " & objArgs.Count & " *.asp " & _
          "files in the same directory as itself and create a Decoded folder. If the " & _
          "script fails be sure you have scrdec14.exe in the same directory."
        call decode_arg
   case 2
        WScript.Echo "Action canceled."
   case -1
        WScript.Echo "Is there anybody out there?"
End Select
else
BtnCode = WshShell.Popup("Do you want to decode all the *.asp files?", 10, "Answer This Question:", 1 + 32)
Select Case BtnCode
   case 1
        WScript.echo "This script will decode all *.asp files in the same " & _
          "directory as itself and create a Decoded folder. If the script fails be sure " & _
          "you have scrdec14.exe in the same directory."
        call decode_all
   case 2
        WScript.Echo "Action canceled."
   case -1
        WScript.Echo "Is there anybody out there?"
End Select
end if

sub decode_all()
Set fs = CreateObject("Scripting.FileSystemObject")
Set WshShell = WScript.CreateObject("WScript.Shell")
Set folder1 = fs.GetFolder(Left(WScript.ScriptFullName, InStrRev(WScript.ScriptFullName, "\")))
If Not fs.folderexists(Left(WScript.ScriptFullName, InStrRev(WScript.ScriptFullName, "\")) & "Decoded") Then
Set folder2 = fs.createfolder("Decoded")
End If
For Each FileName In folder1.Files
  If InStr(UCase(FileName.Name), ".ASP") Then
    intReturn = WshShell.Run("cmd /c scrdec14 " & FileName.Name & " Decoded\" & FileName.Name, 7, False)
  End If
Next
end sub

sub decode_arg()
Set fs = CreateObject("Scripting.FileSystemObject")
Set WshShell = WScript.CreateObject("WScript.Shell")

If Not fs.folderexists(Left(WScript.ScriptFullName, InStrRev(WScript.ScriptFullName, "\")) & "Decoded") Then
Set folder2 = fs.createfolder("Decoded")
End If
For I = 0 to objArgs.Count - 1
'GetFileName strips any path, so the output always lands inside Decoded\
intReturn = WshShell.Run("cmd /c scrdec14 " & objArgs(I) & " Decoded\" & fs.GetFileName(objArgs(I)), 7, False)
Next
end sub