Reversing a ZLib-Obfuscated? Network Protocol [Archive]

Matasano

December 2nd, 2007, 17:40

We just wrapped up a security assessment on a commercial enterprise server/agent security product. I can’t get too specific here, but we did run into an interesting problem that we thought would be worth a post.
The application we were evaluating had a home-grown network protocol doing some interesting things worth investigating. What we were seeing from our network capture wasn’t too far from this:

Code:

00 46414b45 02000000 06060601 5e000000 |FAKE............|

10 dab624ba da73fed5 b9872696 08ea97a5 |..$..s....&.....|

20 2d626160 60c86248 61c86748 65e000b2 |-ba``.bHa.gHe...|

30 bd80ac0c 863c0605 0617b098 3450cc99 |.....<......4P..|

40 c18a2186 21802191 a1042817 c3100294 |..!.!.!...(.....|

50 89610806 92b94015 310c6e0c 990c3940 |.a....@.1.n...9@|

60 961e4332 502c8f21 0dc84f67 0000 |..C2P,.!..Og..|

6e

Just by glancing at the first 16 bytes, you can spot (1) a message signature; (2) some 4-byte little-endian word values, one of which was obviously a length value for the payload; and (3) a version number of 1.6.6.6 in the middle.
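As a rough illustration of that reading, here is a minimal Ruby sketch that unpacks the 16 header bytes from the capture above. The method name `parse_header` and the field names are hypothetical, the second word is left uninterpreted as `word1`, and reading the version bytes in reverse order is an assumption based on the 1.6.6.6 interpretation.

```ruby
# Header bytes copied from offset 0x00 of the capture above.
HEADER_HEX = "46414b45" "02000000" "06060601" "5e000000"

# Split a 16-byte header into its four fields.
# Field names are illustrative guesses, not taken from the product.
def parse_header(raw)
  # a4 = 4 raw bytes, V = 32-bit little-endian unsigned integer
  sig, word1, ver, length = raw.unpack("a4 V a4 V")
  {
    signature: sig,
    word1: word1,                           # meaning not established at this point
    version: ver.bytes.reverse.join("."),   # assumes the bytes are stored low to high
    payload_length: length
  }
end

header = parse_header([HEADER_HEX].pack("H*"))
puts header.inspect
```

The length word comes out as 0x5e, which is consistent with the capture: the dump ends at offset 0x6e, and 0x6e minus the 0x10-byte header leaves 0x5e bytes of payload.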

This looked promising, so we decided to pick it apart some more and see where it got us.

Let me just add at this point: General approaches can vary a lot when it comes to reverse engineering. As you’ll see, what we were doing was not strictly just protocol reversing. We had access to server-side binaries, which we were simultaneously disassembling to guide us at several steps. We could have just gone the strict disassembly route, but in my experience combining the two tends to yield much quicker results.

So, away we went. Or rather, got stuck next. Just past the header of the protocol was a chunk of seemingly meaningless binary data. A bit of disassembling told us that it was something compressed with .NET’s DeflateStream. Here was the real payload and it was time to write our first bit of code.

Since we were working with BlackBag (as regular readers will have noticed, Matasano tends to), our ideal tools would be small, focused ones that could run on Unix, preferably in the middle of a chain of piped commands, so we could say things like:

Code:

% cat | _inflate_ | hexdump -C

And if things got interesting, maybe even:

Code:

% cat | _inflate_ | bkb sub | _deflate_ | bkb blit

We figured we should be able to get the “Inflated” stream using Zlib. So, we set out to put together some Ruby to take a “deflated” standard input and dump “inflated” standard output.

Code:

#!/usr/bin/env ruby

require 'zlib'

buf = STDIN.read()

zs = Zlib::Inflate.new

out = zs.inflate buf

STDOUT.write(out)

And… Fire!

Code:

% cat msg.raw |bkb shf 16 | inflate.rb|hd

./inflate.rb:7:in `inflate': incorrect header check (Zlib::DataError)

from ./inflate.rb:7

Woops… maybe not so simple. We asked the Google ("http://lists.ironpython.com/pipermail/users-ironpython.com/2006-October/003741.html")! Turned out .NET’s DeflateStream doesn’t use the usual ZLIB header and footer as defined in RFC 1950 ("http://www.faqs.org/rfcs/rfc1950.html").
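The error message makes sense once you look at how RFC 1950 defines the two header bytes (CMF and FLG): the low nibble of CMF must be 8 (deflate), and the two bytes read as a big-endian 16-bit number must be a multiple of 31. The sketch below applies that test; `zlib_header?` is a hypothetical helper name, and it checks only these two conditions.

```ruby
# Return true if the first two bytes of data look like an RFC 1950 header.
# Checks only the compression method and the FCHECK multiple-of-31 rule.
def zlib_header?(data)
  return false if data.nil? || data.bytesize < 2
  cmf = data.getbyte(0)
  flg = data.getbyte(1)
  (cmf & 0x0f) == 8 && ((cmf << 8) | flg) % 31 == 0
end

puts zlib_header?("\x78\x01".b)   # a common zlib header
puts zlib_header?("\xda\xb6".b)   # first two payload bytes from the capture
```

The payload in the capture starts with da b6, which fails both conditions, so a decoder expecting a zlib wrapper rejects it before it ever looks at the compressed data.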

Side note: Obviously this had already been tackled. Even though we didn’t try the IronPython solution linked above, I’d probably recommend using it or something like it unless you need something really quick and dirty as we did. The obvious question is, why didn’t we? We were sticking with Ruby for other reasons on this session and didn’t really need a “robust” solution just yet.

So we actually read RFC 1950 at this point. Turned out we just needed to tack on the header (and maybe the footer) ourselves.

Code:

#!/usr/bin/env ruby

require 'zlib'

header = "\x78\x01"

buf = STDIN.read()

zs = Zlib::Inflate.new

# Add the header first

zs << header

out = zs.inflate buf

STDOUT.write(out)

Um.. Fire?

Code:

$ cat msg.raw |bkb shf 16 |./inflate.rb |hd

00 b6a45b7b 499fd59d c2917411 2f7666a2 |..[{I.....t./vf.|

10 04000000 6a006400 6f006500 08000000 |....j.d.o.e.....|

20 4a006f00 68006e00 20004400 6f006500 |J.o.h.n. .D.o.e.|

30 1b000000 43003a00 5c005000 61007400 |....C.:.\.P.a.t.|

40 68005c00 54006f00 5c005300 6f006d00 |h.\.T.o.\.S.o.m.|

50 65005c00 46006900 6c006500 2e006300 |e.\.F.i.l.e...c.|

60 6f006e00 66006900 6700 |o.n.f.i.g.|

6a

Much better.

Those who’ve read the RFC or are already familiar with ZLib may notice we didn’t bother with the ADLER32 checksum footer. Our quick/dirty Ruby ZLib implementation didn’t seem to notice when it was missing. Honestly not sure whether this is expected behavior or not, but it suited us just fine. We really just wanted to get back to picking apart the protocol.
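For readers who would rather sidestep the header and footer question altogether, zlib can be told to expect a bare RFC 1951 deflate stream. In Ruby that means passing a negative window-bits value to `Zlib::Inflate.new`. This is a sketch of an alternative, not what was used in the session; `inflate_raw` is a hypothetical name.

```ruby
require 'zlib'

# Inflate a raw deflate stream (no zlib header, no Adler-32 footer).
# A negative window size tells zlib not to look for the RFC 1950 wrapper.
def inflate_raw(data)
  zs = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  begin
    zs.inflate(data)
  ensure
    zs.close
  end
end

# Usage as a pipe filter would look like:
#   STDOUT.write(inflate_raw(STDIN.read))
```

With this approach nothing has to be prepended to the input, so the filter does not depend on how forgiving the library is about a missing checksum.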

What was “inflated” might also need to get “deflated” again, so we also whipped up a “deflater”.

Code:

#!/usr/bin/env ruby

require 'zlib'

buf = STDIN.read()

zs = Zlib::Deflate.new()

out = zs.deflate(buf,Zlib::SYNC_FLUSH)

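# Output the deflated chunk without the 2-byte zlib header and the trailing 4 bytes

# (with SYNC_FLUSH those 4 bytes are the 00 00 ff ff flush marker; the Adler-32 is only written on finish)

STDOUT.write(out[2, out.length - 6])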
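A variation on the deflater, sketched here as an assumption rather than something tested against the product: asking zlib for a raw deflate stream directly avoids slicing bytes off either end. The helper name `deflate_raw` is hypothetical, and whether a given DeflateStream consumer accepts this output would still need checking.

```ruby
require 'zlib'

# Produce a raw RFC 1951 deflate stream with no zlib header or footer.
# The negative window size disables the RFC 1950 wrapper.
def deflate_raw(data)
  zs = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  begin
    # FINISH terminates the stream with a final block.
    zs.deflate(data, Zlib::FINISH)
  ensure
    zs.close
  end
end

# Usage as a pipe filter would look like:
#   STDOUT.write(deflate_raw(STDIN.read))
```

Unlike the sync-flushed version, this output ends with a proper final block, so a strict decoder has an unambiguous end of stream.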

Turned out we didn’t need to use the “deflate” script much: between protocol decoding and disassembly, we learned that one of the uncompressed 4-byte words in the protocol’s header was the payload *type*, either *deflated* or *raw*. So, even though we confirmed our deflater worked well enough, we usually just changed the type to *raw* whenever we wanted to send something back to the server.

And in conclusion (which I have to speak about in vague terms to protect the guilty, sorry): now that we could read and compose messages, we learned this protocol was letting the agent do some truly crazy things. Things like passing entire lists of fields to insert/update directly into SQL. Without authentication.

Identifying and decompressing the protocol’s payload was the only hurdle we had to get over to proceed with other attacks. In the end, these culminated in several findings, including trivial database corruption and even injecting malicious data to capture admin privileges through the product’s console. Again… without authentication.

Moral of the story:

I try not to speculate too much about what developers’ intentions are or were when I find something like this. Hindsight is 20/20 and it’s generally a lot easier to break than build. But I couldn’t help wondering whether they had intended to use DeflateStream as a cheap form of obfuscation here. It’s just as possible they simply wanted to keep the payloads small and didn’t consider the risks faced by the protocol at all.

Zlib is not encryption (I feel dumb even saying it). Even more so if your protocol is wide open. Authentication would have been tricky no matter what. There were inherent trust boundaries invading way into the agent. That was even more reason for this protocol to use encryption. Though frankly it wouldn’t have solved all this protocol’s problems — crypto is not an argent projectile. There were some deeper design issues lurking here.

But at the very least, it would have raised the bar for reversing. Because the ZLib “hurdle” took us all of about 20 minutes to beat.

http://www.matasano.com/log/862/reversing-a-zlib-obfuscated-network-protocol/