Not quite satisfied, I did further digging in my free time and amidst an essay involving the topic. I have a picture that tells the story much faster, but can't upload it atm =(
Part One:
(Irrelevant) Information about
.NET Strings and the #US Stream
The flow chart for how the memory location .70000001 is determined works like this.
First, the address .00402010 points to the MetadataRoot
Next, the MetadataRoot contains a stream header which points towards the #US stream (the "User String" stream)
The #US stream is an area that contains the string data we tend to be interested in. The file's #US location is mapped to .70000000
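To make the .70000000 mapping a bit more concrete, here's a rough sketch of how I read a pointer like .70000001: the top byte (0x70) says "user string" and the low three bytes are an offset into the #US stream. The function name is just something I made up for illustration.

```python
def split_user_string_token(token: int) -> tuple[int, int]:
    """Split a 32-bit token into (top byte, 24-bit offset into #US)."""
    table = (token >> 24) & 0xFF       # 0x70 marks a user string
    offset = token & 0x00FFFFFF        # position inside the #US stream
    return table, offset


table, offset = split_user_string_token(0x70000001)
print(hex(table), offset)  # 0x70 1
```

So "the first string" really means "the string that starts one byte into #US".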
MetadataRoot
Let's begin by going over the intricacies of MetadataRoot. MetadataRoot is laid out as: (variable) irrelevant info + stream count + (variable) stream headers
StreamHeaders are laid out as: Offset, size, (variable) tag...
The offset in the #US stream header points to the start of the #US stream (our .70000000), which is where the token .70000001 that interests us gets looked up.
Because the irrelevant information has a variable size, and the stream headers have a variable size as well, it is difficult to answer the question "Where is .70000000?", but I will attempt to do so.
I've devised a flow chart for explaining where the string of a .NET assembly will be located. Assume numbers prefixed with '.' are addresses in hex. An address suffixed with (Xn) refers to a data block n bytes long beginning at that address. The suffix [I] refers to a specific element of the data block. Consider the string pointer .70000001 (or rather simply 01, the first string).
(for -> read "leads to")
To find string 70000001...
.00402010(X4) -> .MetadataRoot
.MetadataRoot + 16 + m+x + 2 -> .StreamCount(X2), and right after it -> .StreamHeaderArray
.StreamHeaderArray(XN) -> AllStreamHeaders
AllStreamHeaders[I] -> .iStreamHeader
.iStreamHeader[offset] -> .70000001 note that: .iStreamHeader[BlockName] is iStream's Named Block
To reiterate our objective, we want iStreamHeader[offset] (within the #US stream header, wherein the .70000001 address is indicated),
but first we logically need to find the .StreamHeaderArray, which begins right after the 2-byte stream count at .MetadataRoot + 16 + m+x + 2.
So we need to get
.MetadataRoot, found by .00402010(x4)
and m+x, found by V + (V -% 4). This is explained better down below. (Hint: V is the length of the .NET version string, + 1 for the null, and -% 4 refers to padding to 4-byte blocks.)
Well, let's walk through the process in more elucidating detail.
MetadataRoot...
.00402010: 4-byte offset pointing to MetadataRoot (stored little-endian, so read the bytes backwards)
.MetadataRoot = .00400000 + .00402010(x4)
Let's say for example .00402010(x4) = e8 23 ...then...
.MetadataRoot = .004023E8
Great, that's the first step of the equation
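Here's a small sketch of that first step. I'm assuming the image base is .00400000 as in the example, and that the two bytes elided after e8 23 are 00 00 (which is what the .004023E8 result implies). The function name is mine.

```python
import struct

IMAGE_BASE = 0x00400000  # assumed load address, taken from the example


def metadata_root_address(raw4: bytes, image_base: int = IMAGE_BASE) -> int:
    """Read a 4-byte little-endian offset and add the image base."""
    (rva,) = struct.unpack("<I", raw4)  # "<I" = little-endian unsigned 32-bit
    return image_base + rva


# e8 23 followed by the assumed 00 00
addr = metadata_root_address(bytes.fromhex("e8230000"))
print(f".{addr:08X}")  # .004023E8
```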
StreamHeaders...
MetadataRoot contains a count of "streams" which we're somewhat interested in. We need to find the stream header tagged "#US", which also includes a chunk pointing to where the strings section begins and a size chunk. (Per 24.2.2 the tag is null-terminated ASCII, not Unicode.) Each stream header consists of 2 chunks of 4 bytes each, plus n chunks that comprise the null-terminated tag name, padded out to a whole chunk. So the "#US" stream header consists of 12 bytes in total, whereas a stream like "#Strings" (9 bytes with the null) would consist of 2 chunks + ⌈9 bytes / 4 bytes per chunk⌉ = 5 chunks, or 20 bytes in all. (ref: ceiling function)
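A quick sketch of the header-size arithmetic, assuming the layout described above (4-byte offset, 4-byte size, then the null-terminated tag padded to a multiple of 4). The helper name is made up.

```python
def stream_header_size(name: str) -> int:
    """Bytes taken by one stream header: offset + size + padded tag."""
    tag_len = len(name.encode("ascii")) + 1   # +1 for the null terminator
    padded_tag = (tag_len + 3) & ~3           # round up to a multiple of 4
    return 4 + 4 + padded_tag


print(stream_header_size("#US"))    # 12
print(stream_header_size("#GUID"))  # 16
```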
.StreamCount = .MetadataRoot + 16 + m+x + 2
...such that
.StreamCount is the address of a 2-byte count of the number of streams in the assembly,
m is the block of bytes holding the null-terminated .NET version string (UTF-8 per 24.2.1), and
x is the padding of that block, which ensures the bytes consumed by the version string are divisible by 4.
If the .NET version is "v4.0.30319" then m = 11 (10 characters + 1 for the null) and x = 1, so m+x = 11+1. Let's try this notation; I find it handy atm and can't think of a better way to express this... If anyone can think of other, more concise ways, please share.
11 + (11 -% 4) is the same as 11 + (-11 mod 4), which happens to be the same as the ceiling function's notation 4 * ⌈11 / 4⌉. They all equal 12 bytes. (Writing it as -(11 mod 4) + 4 also gives 1 here, but that form gives 4 instead of 0 when the length is already a multiple of 4.)
So with that discrete consideration behind us, the version string plus the padding is explained as
m+x = V + (V -% 4)
x = (11 -% 4) = 1 byte long
m+x = 11 + (11 -% 4) (the 11 already includes the +1 for the null terminator of the string)
m+x = 12 bytes
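For anyone who'd rather see the padding as code than as my -% notation, here's a minimal sketch. I'm writing the padding as (-length) % 4, which gives 0 when the length is already a multiple of 4; pad_to_4 is a made-up helper name.

```python
import math


def pad_to_4(length: int) -> int:
    """Number of padding bytes needed to reach the next multiple of 4."""
    return (-length) % 4


v = len("v4.0.30319") + 1          # 10 characters + 1 null terminator = 11
x = pad_to_4(v)
print(x, v + x, 4 * math.ceil(v / 4))  # 1 12 12
print(pad_to_4(12))                    # 0
```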
.StreamCount = .MetadataRoot + 16 + m+x + 2
.StreamCount = .MetadataRoot + 16 + 12 + 2
.StreamCount = .004023E8 + 30 (30 is decimal here, i.e. 0x1E)
.StreamCount = .00402406 ; remember there's x2 bytes of interesting data at that location
Let's just say that our stream count is five, or
.StreamCount(x2) = 05 00 as stored, which read backwards is 0x0005 streams counted!
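The same arithmetic as a sketch, using the example numbers from above (.004023E8 for the root, 12 for m+x, and the bytes 05 00 for the count). The function name is my own.

```python
import struct


def stream_count_address(metadata_root: int, padded_version_len: int) -> int:
    """16 fixed bytes + padded version string + 2 more bytes, then the count."""
    return metadata_root + 16 + padded_version_len + 2


addr = stream_count_address(0x004023E8, 12)
print(f".{addr:08X}")  # .00402406

# The count is 2 bytes, little-endian: 05 00 reads as 5.
(count,) = struct.unpack("<H", bytes.fromhex("0500"))
print(count)  # 5
```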
The stream headers that follow .StreamCount contain a chunk holding the ASCII phrase "#US". The #US stream is the stream that contains the .NET string that we're looking for. The phrase is found somewhere after .StreamCount+2 but before the end of the stream headers. The end of the stream headers is found by digging through each stream header...
Knowing that each stream header is simply: an offset, a size, and a tag...
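Code:
.CurrentHeader = .StreamCount + 2
Loop (at most StreamCount times):
.TagI = .CurrentHeader + 8 ; the tag starts after the 4-byte offset and the 4-byte size
TagLength = number of bytes read from .TagI up to and including the 0x00
.HeaderEnd = .TagI + TagLength + (TagLength -% 4) ; pad the tag to a 4-byte boundary
Return the offset at .CurrentHeader(x4) if the tag at .TagI == "#US" ; the offset is relative to .MetadataRoot
.CurrentHeader = .HeaderEnd
Jmp loop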
...digging through as I said.
Here are the white pages I referenced, from the Partition II Metadata.doc ("http://download.microsoft.com/download/d/c/1/dc1b219f-3b11-4a05-9da3-2d0f98b20917/partition%20ii%20metadata.doc"). See 24.2.1 and 24.2.2 especially.