Copernic 2001 Pro (Version 5.0)
Light Version from: http://wwww.copernic.com/
[Use it to find its bigger brother ;)]
W32Dasm 8.93 - Recommended
HexWorkshop - Essential Tool
Filemon - Essential Tool
C Compiler - Language for Tool Writing
I have been on a quest to find the query URLs and the structure of queries as part
of my hunt for data for my local search bot. My last essay was finished
and the target's data had been extracted. With a fresh set of data in my hands,
I sat down and started writing a converter to put the data into a common file format.
This is where this essay begins. I had decided on a basic subset of the data
to use, but thought I should check it against other sources (in other bots).
First on the pile was webferret, a search-bot about which
Laurent has written an essay.
As is my usual habit I did not let the software within wire distance of the
internet, so it did not get the updates, and the dataset provided as standard is
pretty poor - so I threw it in the bin.
Laurent had mentioned to me that I might find copernic interesting. Umm...
Could this be a good target? I had heard of it, but had until recently steered
clear of all these search-bot programs. This was because I know you do not get anything
for nothing, and the thing that makes them money is knowing your searches, and
being able to make you sit through advert after advert after advert...
So off to the web, do a search for copernic and read some reviews. Seems like
another of these local search bots, where the main advantage is it knowing how
to talk to the search engines and co-ordinate the replies and present them to
the user in a nice simple way. This sounded interesting and it seemed to support
a large number of search engines but no specific numbers were given. I went to
some lengths to avoid visiting any of the copernic sites, for reasons, which will
become apparent later.
So the target was picked; the next step was to go and find it on the web.
So off to the web, where I grabbed the Pro version. I did not even go near their
site, so if they are busy checking logs they will not find me ;)
The Pro version came with a key - nice!
Out came the clean PC. This machine was not connected to any network or the internet,
after all we did not want any uncontrolled data to go out ;). Filemon was started
and left running and then copernic was installed on the pc. The installation process was
allowed to finish, but the program was not run yet. The filemon log
of the installation was then saved for later reference. So now to clear the Filemon log
and leave it running, to log the files accessed by the program.
Next step is to run the program and set it to point to the local proxy. Right - the first
thing it does is ask you for some registration details. When all the data has been entered and
the proxy set up, it tries to connect to get an update. [This is very optimistic of the company - that
all people who install and run it for the first time will be connected to the internet]
Right, so look at the logs on the proxy and there are a number of requests to "updates.copernic.com".
Now let's try a search for 'searchlores'. At this point I know it is not going to get
any results, as the proxy does not connect to the internet and just returns 404 for every
request, as though routing was broken. So I did the search. Looking at the proxy logs, in
amongst the requests for search engine pages there is one that stands out, to
"regcards.copernic.com".
Now follows an explanation of these requests, as they are quite interesting. They go
to the copernic.com domain so they must contain some user data or be used to track
users of this program in some way.
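Picking these requests out of a busy proxy log by eye gets tedious, so here is a minimal sketch of a filter, assuming the log holds one proxy-style request line per entry ("METHOD http://host/path HTTP/1.1"). The helper names request_host and is_vendor_request are my own invention and not part of any tool mentioned here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the host part of a proxy-style request line
 * ("METHOD http://host/path HTTP/1.1") into out. Returns 0 on success,
 * -1 if the line holds no absolute http:// URL or the host does not fit. */
int request_host(const char *line, char *out, size_t outsz)
{
    assert(line != NULL && out != NULL && outsz > 0);
    const char *p = strstr(line, "http://");
    if (p == NULL)
        return -1;
    p += strlen("http://");
    size_t n = strcspn(p, "/: ");       /* host ends at '/', ':' or a space */
    if (n == 0 || n >= outsz)
        return -1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}

/* Returns 1 if the request goes to copernic.com or one of its subdomains.
 * The comparison is case-sensitive, which is a simplification. */
int is_vendor_request(const char *line)
{
    static const char domain[] = "copernic.com";
    char host[256];
    if (request_host(line, host, sizeof host) != 0)
        return 0;
    size_t hl = strlen(host), dl = strlen(domain);
    if (hl < dl || strcmp(host + hl - dl, domain) != 0)
        return 0;
    /* either an exact match, or the domain is preceded by a dot */
    return hl == dl || host[hl - dl - 1] == '.';
}
```

Only the first URL on the line is examined, so a vendor URL passed as a query parameter to some other host would not be flagged by this sketch.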
Firstly let's look at the update requests. The first one is a HEAD.
This is the request sent:

    HEAD http://updates.copernic.com/copernic2001upd/copernic2001plus.cui HTTP/1.1
    Host: updates.copernic.com
    Accept: */*
    Connection: close
    User-Agent: Copernic
    Pragma: no-cache

Second it does a GET:

    GET http://updates.copernic.com/copernic2001upd/copernic2001plus.cui HTTP/1.1
    Host: updates.copernic.com
    Accept: */*
    Connection: close
    User-Agent: Copernic
    Pragma: no-cache

Why do a HEAD if, when it fails, you go on to do the GET anyway? Why not simply do a GET? This seems very pointless ;)

The next request goes to www.copernic.com:

    GET http://www.copernic.com/cgi-bin/nph-osnvs2.pl?ns=##########################&iu=%7B********-****-****-****-************%7D&lo=http://updates.copernic.com/copernic2001upd/copernic2001plus.cui&cl=0 HTTP/1.1
    Host: www.copernic.com
    Accept: */*
    Connection: close
    User-Agent: Copernic
    Pragma: no-cache

The field marked with '*'s will be explained in the next request, as it is a common parameter which is passed in both requests. The field marked with '#'s also seems to be a number of some form to be sent to their server.
Now let's look at the regcard information, which is sent with a POST.
This is the request sent:

    POST http://regcards.copernic.com/cgi-bin/regcard HTTP/1.1
    Host: regcards.copernic.com
    Accept: */*
    Connection: close
    User-Agent: Copernic
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 129

    %5Ejohndoe%40mort.somewhere%5EUnited%20States%5E12345%5E0%5E0%5EENGPRO%5E5001%5E%********-****-****-****-************%7D%5EFrom%20web%20site%5E%5E0%5EJohn%20Doe

Plain text of last line:

    ^johndoe@mort.somewhere^United States^12345^0^0^ENGPRO^5001^{********-****-****-****-************}^From web site^^0^John Doe
Value | Description |
johndoe@mort.somewhere | Email Address |
United States | Country |
12345 | Zip Code |
0 | Unknown |
0 | Unknown |
ENGPRO | Version of Software |
5001 | Registration Card Version |
{********-****-****-****-************} | GUID |
From web site | Referrer for Product |
(empty) | Unknown |
0 | Unknown |
John Doe | Username |
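The plain text above is just the POST body percent-decoded and then cut at every '^'. A minimal sketch of both steps follows; the function names url_decode and split_fields are my own, and the splitter deliberately keeps empty fields (strtok would silently drop them), since the regcard has one.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Percent-decode src into dst (dst must hold strlen(src) + 1 bytes).
 * Malformed escapes are copied through unchanged. Returns decoded length. */
size_t url_decode(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    size_t n = 0;
    while (*src != '\0') {
        int hi, lo;
        if (src[0] == '%' &&
            (hi = hex_value((unsigned char)src[1])) >= 0 &&
            (lo = hex_value((unsigned char)src[2])) >= 0) {
            dst[n++] = (char)(hi * 16 + lo);
            src += 3;
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}

/* Split s in place on '^', keeping empty fields. Stores at most
 * max_fields pointers and returns the total number of fields seen. */
size_t split_fields(char *s, char *fields[], size_t max_fields)
{
    size_t count = 0;
    char *start = s;
    for (;;) {
        char *sep = strchr(start, '^');
        if (count < max_fields)
            fields[count] = start;
        count++;
        if (sep == NULL)
            break;
        *sep = '\0';
        start = sep + 1;
    }
    return count;
}
```

Because the body starts with a '^', the first field returned is an empty one and the email address is field number two.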
"http://regcards.copernic.com/cgi-bin/regcard" "http://updates.copernic.com/copernic2001upd/" "http://www.copernic.com/cgi-bin/nph-osnvs2.pl" "www.copernic.com"The first ones can be nullified by writing "http://127.0.0.1/" at the start of the strings. This then will prevent all accesses to their servers. This is a good alternative to the hosts file, as the program seems to bypass the hosts if using a proxy and just sends the requests straight to the proxy.
So next step is to close the program, save the filemon log and have a look around my system.
I had a browse through the install filemon log file and made a note of the location of files
added to my system. The first thing that hit me was a load of '.csf' files which had
the names of search engines, and a list of '.ssf' files which seemed to represent
categories.
The next thing is to look at the run filemon log. The program seems to read the .ssf and .csf files
and then create a set of files under the directory 'data', in what seems to be a user profile
with the user's name as the folder name. Ummm, so some kind of translation or copying is going
on, but a lot fewer files get written than read.
So to open up the main executable in our favourite hex viewer and have a quick browse, but
first to extract all the strings from the file. Had a browse through the strings and it
looks like it was coded in Delphi. This was just a hunch and I remembered having a copy of
DFM-Explorer around, so tried it on the file and sure enough out came all the resources,
so it is for sure Delphi. So the task is now to find a Delphi decompiler. My thinking here
was that even though it might not be needed, if it is then it might make the program code
a bit easier to understand. Also better to check this option to start with rather than
later. As a teacher once told me "Always get all your tools ready before starting any task!"
The catch is : this is a delphi application, warning bloatware imminent. I had thought that
the executable was a bit on the large side for something so seemingly simple, and this explained
it. No extra DLL's or files, so the delphi libs must be statically linked. I remember when
applications used to fit on a floppy, now the icon files will not ;(.
First step is to grab ye ole webbrowser and search for a delphi decompiler (I must admit shame
and say I had never used one before). Right the one that pops up the most in the list when
ranked is 'DeDe' by DaFixer!. Ok so lets grab it and let it rip.
A few sips of my drink later and it has finished downloading, so lets run it and see what
it comes up with. DeDe recognises the file and does its stuff, and yes it is delphi because
I now have the forms and pascal code nicely disassembled on my HD. So a quick browse through
them to get an idea of the structure. umm
I noticed that DeDe also supports exporting all its references to a W32dasm project. Since
one of the steps I was going to do was to disassemble the file, I ran Wdasm and generated
a project file, then pointed DeDe to it and let it do its stuff. Hopefully when it finishes
it will leave a nice big file with the combined references, so that should make life easier
later on. Being able to see the references to the Pascal and Delphi bits should make the code
a bit easier to follow.
While that was running (it takes some time) my next step was to search all the .pas files
for references to 'ssf' and 'csf' to find where it loaded the data files, I did not find
any references of these strings in any of the .pas files. Ok time to load up the W32Dasm
project and have a look in that file. OK PROBLEM! - the project is still being accessed
during the combining of references, so that option is out for an hour or so, as it seems
to take quite some time (35Mb File to process).
So lets have a look around, there are some DLL's in the directory, so lets check them out:
c4dll.dll is Database Engine Library (Sequiter CodeBase Components for Delphi)
xcdunz32.dll is a Zip Library [Xceed Zip Compression Library]
SSCE5253.dll is the Sentry Spelling-Checker Engine [Wintertree Software]
Zip Library - is this just there for the installation or unpacking updates, or might it
be used on the data files? Time to check, if the data files are zipped then they should
be fairly easy to unpack. That would make life very easy ;)
So lets look at the files that were generated when the program was run, the files in
what looked like a profile directory.
channel.ctb seems the most likely candidate, and matches (by some coincidence) roughly
the combined size of all the .ssf and .csf files:
channel.ctb (1,158,690 bytes)
All .ssf - category files (73,718 bytes)
All .csf - engine files (1,131,657 bytes)
This seems a strange coincidence, as opening up this file shows it does have the engine
names and the category names (from filenames) but also contains a LOT of space characters.
Given that this is in a directory named after the user, this should be the user's preferences
for searches or something similar.
Back to the data files, as the only files looking good candidates are the '*.*sf' files
which fit the bill perfectly. So opened one up in notepad and it looks unreadable.
So right, copied three .ssf and three .csf files of different sizes to a temporary
directory to start looking at them. Opened the first one in a hex viewer and noticed
that it is not plain text, ok so it was expected they would be packed or encrypted
in some way, they would not leave their whole product out in the open. But one thing
that did jump out was the pattern of the characters.
Here is an excerpt from one of the files: (Boxes are unprintable characters)
Ssôsôßxøù?Ûyø[òSSsðS3ðSrQSSSSSSôSSsóù¹x;ÛùôSSss¾=
¸˜›òÓSSð'3rrrQPSsðS3ðSrÐQsS3³órpôSSssÝx;[Øyzœyøùòs3|
˜ÙxyôSSss\¸øù_y¹ùòXÛÛ[°°»»»yØÛy›x;Ûyx˜°ôSSss?û[[¸Ûœ
ù»òßûùôSSss=yÛù¹¸zòßXù¿ù|˜ôSSssŸù;x¸˜ò3SSSôSSssxØù
Ÿù;x¸˜òôSSss}ûÛ¸ÿ[ÙyÛùòßûùôSSss=Xy˜˜ùØ?ùÛòßXù¿ù|˜ôS
Sss=Xy˜˜ùØ?ùÛ3òßXù¿ù|˜ôSSs"ôSSsó|˜xÛôSSss?¸û9ùÿØòXÛ
Û[°°;ùy9Xs3x˜Ùxy9¸ø°9¹xðx˜°{ûùz°yÙ›;99¹xôSSssù;û
ØÛ;_ù_y¹ùòsSôSSssÿ;ù}¹ù˜Ûòü¸xØØy°ÓSQP9¸ø[yÛxØù2Qü?|ýQ
ÓSs2Q¿x˜Ù¸»;QrRpôSSssûØù;Py˜¹ùp?ÛyÛüy8ùòÒØ¸98{û
¸Ûù'ôSSssûØù;Py˜¹ùpý˜Ùüy8ùòÒ™¸˜ÛQ™y9ùò0=XP3Óp0}
xyØ0=XP3Óp0Q;xùò0=XP3Óp0ðs0=XP3Óp0'ôSSssûØù;P
}ÙÙù;;p<ùzòßûùôSSssûØù;PßxÛØùp?ÛyÛüy8ùòÒxø¹Q;9

Notice the repeated 'SS', 'SSs' and 'SSss' sequences. Instinct at this point says that this is not a packed file, as these repeats would have been eliminated by the compression process. There are other repeated sequences present in the encoded text.
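That instinct can be backed up with two crude counts: how many different byte values a file uses, and how often a short sequence such as 'SSss' (53 53 73 73 in hex) repeats. Packed data tends to use nearly all 256 values and to repeat very little. The sketch below is only a rough heuristic of mine, and both function names are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count how many of the 256 possible byte values occur in buf. */
size_t distinct_bytes(const unsigned char *buf, size_t len)
{
    unsigned char seen[256] = {0};
    size_t distinct = 0;
    for (size_t i = 0; i < len; i++) {
        if (!seen[buf[i]]) {
            seen[buf[i]] = 1;
            distinct++;
        }
    }
    return distinct;
}

/* Count (possibly overlapping) occurrences of pat in buf. */
size_t count_pattern(const unsigned char *buf, size_t len,
                     const unsigned char *pat, size_t plen)
{
    assert(plen > 0);
    size_t hits = 0;
    for (size_t i = 0; i + plen <= len; i++)
        if (memcmp(buf + i, pat, plen) == 0)
            hits++;
    return hits;
}
```

A file that uses few distinct values and repeats a four byte pattern many times is far more likely to be a simple substitution than the output of a compressor.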
    9D9D5373F41473F414DF78F8F93FDB79
    F85BF213535373F05333F073F3515353
    125353125353F414535373F31FF9B978
    3BDBF91BF41453537373BE3DB8989BF2
    11D3535313F0923372727251505373F0
    5333F073 . . . . (more data) F414

This is also the same in Buysoftware, which is a 2k file, apart from one byte:

    9D9D5373F41473F414DF78F8F93FDB79
    F85BF213535373F05333F073 72 [changed F3 to 72] 515353
    125353125353F414535373F31FF9B978
    3BDBF91BF41453537373BE3DB8989BF2
    11D3535313F0923372727251505373F0
    5333F073 . . . . (more data) F414

This seems to be the only difference, but it is not the same in all 2k files...

In the copernic.csf file it is:

    9D9D5373F41473F414DF78F8F93FDB79
    F85BF213535373F05333F0 53 [changed 73 to 53] 72 [changed F3 to 72] 515353
    125353125353F414535373F31FF9B978
    3BDBF91BF41453537373BE3D . . . . (more data) F414

It is different after this.
Ss
s
ßxøù?Ûyø[òSSsðS3ðSrQSSSSSS
SSsóù¹x;Ûù
SSss¾=¸˜›òÓSSð'3rrrQPSsðS3ðSrÐQsS3³órp
SSssÝx;[Øyzœyøùòs3|˜Ùxy
SSss\¸øù_y¹ùòXÛÛ[°°»»»yØÛy›x;Ûyx˜°
SSss?û[[¸Ûœù»òßûù
SSss=yÛù¹¸zòßXù¿ù|˜
SSssŸù;x¸˜ò3SSS
SSssxØùŸù;x¸˜ò
SSss}ûÛ¸ÿ[ÙyÛùòßûù
SSss=Xy˜˜ùØ?ùÛòßXù¿ù|˜
SSss=Xy˜˜ùØ?ùÛ3òßXù¿ù|˜
SSssûØù;Py˜¹ùpý˜Ùüy8ùòÒ™¸˜ÛQ™y9ùò0=XP3Óp0}xyØ0=XP3Óp0Q;xùò0=XP3Óp0ðs0=XP3Óp0'
SSssûØù;P}ÙÙù;;p<ùzòßûù
SSssûØù;PßxÛØùp?ÛyÛüy8ùòÒxø¹Q;9ò0=XP3Óp0°xøy¹ù;°[y¹ù¹x™0=XP3Óp0QXùx¹XÛòs3Q»xÙÛXòs3'
SSssûØù;PßxÛØùpý˜Ùüy8ùòÒ°y'

This seems to fit the structure of a configuration file: short line lengths. Later in the file are longer lines, about the size of a query URL, so this seems right ;) There is also a pattern to the characters at the start of the line, and notable is that the repeated 'SS' combination appears at the end of strings - this means (hopefully) that it is not a position dependent (or offset) substitution.
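To look at line lengths without squinting at a hex viewer, the records can be measured directly. The sketch below assumes that the byte pair F4 14, which shows up at the end of every run in the hex dumps, is the line terminator; that is my reading of the data at this stage, not something the program has confirmed yet. The name record_lengths is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Split buf into records terminated by the byte pair F4 14 and store the
 * length of each record (terminator excluded) in lengths[]. Bytes after
 * the last terminator are ignored. Returns the number of records found;
 * only the first max_records lengths are stored. */
size_t record_lengths(const unsigned char *buf, size_t len,
                      size_t *lengths, size_t max_records)
{
    assert(buf != NULL && lengths != NULL);
    size_t count = 0, start = 0;
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == 0xF4 && buf[i + 1] == 0x14) {
            if (count < max_records)
                lengths[count] = i - start;
            count++;
            start = i + 2;
            i++;                /* skip the second terminator byte */
        }
    }
    return count;
}
```

Run over a whole file, a list of mostly short records with a few URL-sized ones further down would match what the excerpt suggests.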
DeDe has now finished, so we can start looking at the assembler for the file. First task
is to hunt down the references to any .ssf or .csf files. When looking through the file you
will find a few references to this string. These were used as a starting point and breakpoints
were set on them.
I shall take a wander here - bear with me! When I started looking at DeDe, I was intending
to work from the disassembled files and track through the code in order to find the
decryption routine which would restore the files to plaintext. Now my priorities had
changed somewhat, what I was now after was a portion of the plaintext file and hopefully
all of one of the files in memory so that it could be saved. The fact that the cipher
seemed to be a substitution one from the data shown above means that although to find
the decryption routine would be nice, to find a portion of the plaintext would be just
as nice in helping find the result. If they have used a table then hopefully once we
have a portion of the plaintext and what it maps to in the encrypted file, finding the
table in memory would be very easy. This seems a nicer and quicker approach than
reading through page after page of disassembled code trying to put it together. This
point is made more by the fact that the app is in delphi, so a simple instruction
could quite easily call many functions all over the place.
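The known-plaintext idea can be written down before any plaintext is in hand. Here is a minimal sketch, assuming the cipher really is a fixed byte-for-byte substitution and that the encoded and plain bytes are aligned one to one; table_init and table_learn are names I made up. A contradiction while learning would show that the assumption is wrong.

```c
#include <assert.h>
#include <stddef.h>

#define UNKNOWN (-1)

/* Reset all 256 table entries to UNKNOWN. */
void table_init(int table[256])
{
    for (int i = 0; i < 256; i++)
        table[i] = UNKNOWN;
}

/* Record cipher[i] -> plain[i] for n aligned bytes.
 * Returns 0 on success, or -1 on the first contradiction, which would
 * mean the cipher is not a fixed byte-for-byte substitution. */
int table_learn(int table[256], const unsigned char *cipher,
                const unsigned char *plain, size_t n)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        int known = table[cipher[i]];
        if (known != UNKNOWN && known != plain[i])
            return -1;
        table[cipher[i]] = plain[i];
    }
    return 0;
}
```

Every entry still marked UNKNOWN afterwards is a byte value that the captured plaintext did not cover, so feeding in more lines fills in more of the table.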
So, trying to resist the urge to go through the code and reassemble what happens (which
is very hard), I start the code running in W32Dasm with breakpoints set on every
instance of a string that ends in '.ssf' and '.csf'. It soon breaks on one of them.
At this point I set auto-api stop, and show parameters for local and system calls
and set it running again. What I am hoping for is one of the calls to have a
pointer to the plaintext in the call to it.
Here is the bit of code that loads 'Copernic.csf', which is thought to be the
master configuration file.
* Possible StringData Ref from Code Obj ->"Copernic.csf"
                                  |
:52A00A BAB8A75200              mov edx, 52A7B8
:52A00F E8FCA0EDFF              call 404110
:52A014 8B55E0                  mov edx, dword ptr [ebp-20]
:52A017 8B45FC                  mov eax, dword ptr [ebp-04]
:52A01A 8B4020                  mov eax, dword ptr [eax+20]
:52A01D 8B08                    mov ecx, dword ptr [eax]
:52A01F FF5158                  call [ecx+58]
:52A022 8B45FC                  mov eax, dword ptr [ebp-04]
:52A025 8B4020                  mov eax, dword ptr [eax+20]
// This following call seems to handle the
// file and contains a call which exposes the
// plaintext
:52A028 E8970AFAFF              call 4CAAC4        // HANDLEFILE
:52A02D 85C0                    test eax, eax
:52A02F 7425                    je 52A056
:52A031 6A00                    push 0
:52A033 6A00                    push 0
:52A035 A1C4255B00              mov eax, dword ptr [5B25C4]
:52A03A 8B00                    mov eax, dword ptr [eax]
:52A03C 8B4050                  mov eax, dword ptr [eax+50]
:52A03F BA02000000              mov edx, 2

The code below is the start of the HANDLEFILE routine:
* Referenced by a CALL at Addresses:
|:4EB84D, :52A028, :599F7B, :59A81A
:4CAAC4 55                      push ebp
.
... next part is further down the function.
.
:4CAAFA 8D55E8                  lea edx, dword ptr [ebp-18]
:4CAAFD 8B45FC                  mov eax, dword ptr [ebp-04]
:4CAB00 8B08                    mov ecx, dword ptr [eax]
:4CAB02 FF511C                  call [ecx+1C]
:4CAB05 8B45E8                  mov eax, dword ptr [ebp-18]
:4CAB08 BA01000000              mov edx, 1
// This function has the plain text for the
// line from the file passed into and out of
// it, so the decoding must happen before this!!!
:4CAB0D E892EDFFFF              call 4C98A4
// [ebp-10] points to the start of text, both into
// and out of this function
* Possible StringData Ref from Code Obj ->"DisplayName"
:599FA0 BA14A65900              mov edx, 59A614
:599FA5 8B45E4                  mov eax, dword ptr [ebp-1C]
:599FA8 E8AB4DF2FF              call 4BED58
:599FAD 8D45C4                  lea eax, dword ptr [ebp-3C]
:599FB0 33D2                    xor edx, edx
:599FB2 E8B5B6E6FF              call 40566C
:599FB7 8D4DC4                  lea ecx, dword ptr [ebp-3C]

This code is repeated with the following string references:

* Possible StringData Ref from Code Obj ->"Description"
* Possible StringData Ref from Code Obj ->"HomePage"

So this bit of code is parsing a file of some kind, looking for the identifiers given in the string references. That means our file MUST contain some of the above strings, as they do not seem to be used in any other files.
So now we have a portion of the plaintext written down (or in a file)
and this looks very good, and seems to confirm a lot of things. The string
pointed to is shown below, and when looking for the first time you should
also refer back to the previous text and see what bells ring ;)
A portion of the plaintext:
    FF01
    0015Register
    0011_Conv="4002->3999 (01-03-09, 10:37:59)"
    0011DisplayName="123India"
    0011HomePage="http://www.altavista.in/"

The order is slightly changed from the order in the file (only a couple of entries swapped) but note the line lengths, as these are a giveaway. So we now know for sure that we are on the right track - GOOD! Now you can call me stupid if you want, but '0011' looks a bit like 'SSss', and '001' would likewise match the 'SSs' occurrences.
Encoded | Decoded |
10 | 2a |
11 | 22 |
12 | 3a |
13 | 32 |
14 | 0a |

Encoded | Decoded | Encoded | Decoded |
18 | 6a | f8 | 6d |
19 | 62 | f9 | 65 |
1a | 7a | fa | 7d |
1b | 72 | fb | 75 |
1c | 4a | fc | 4d |
1d | 42 | fd | 45 |
1e | 5a | fe | 5d |
1f | 52 | ff | 55 |
38 | 6b | 58 | 68 |
39 | 63 | 59 | 60 |
3a | 7b | 5a | 78 |
3b | 73 | 5b | 70 |
3c | 4b | 5c | 48 |
3d | 43 | 5d | 40 |
Variables:
    IN_A  = encoded_byte
    IN_H  = encoded_byte_high_nibble
    IN_L  = encoded_byte_low_nibble
    OUT_A = decoded_byte
    OUT_H = decoded_byte_high_nibble
    OUT_L = decoded_byte_low_nibble

To set up the code do the following:
    IN_A = read_from_file();
    IN_H = IN_A & 0xf0;
    IN_L = IN_A & 0x0f;

Before exiting:
    OUT_A = OUT_H | OUT_L;

Taking the examples 0x38 -> 0x6B and 0x39 -> 0x63: it seems like there are two values for the lower nibble of the output, and these are offset by 8, so whatever the lower value is, the higher one is that plus 8. (Look at the table above to confirm this.) Which value is used seems to depend on the low bit of IN_A. So the final step is to take the low bit of IN_A and, if it is clear, add 0x08 to the output byte.
Original Nibble | Output Nibble |
0x0, 0x1 | 0x2 |
0x2, 0x3 | 0x3 |
0x4, 0x5 | 0x0 |
0x6, 0x7 | 0x1 |
0x8, 0x9 | 0x6 |
0xA, 0xB | 0x7 |
0xC, 0xD | 0x4 |
0xE, 0xF | 0x5 |
Variables:
    IN_A   = encoded_byte
    IN_H   = encoded_byte_high_nibble
    IN_L   = encoded_byte_low_nibble
    OUT_A  = decoded_byte
    OUT_H  = decoded_byte_high_nibble
    OUT_L  = decoded_byte_low_nibble
    LOOKUP = [2,2,3,3,0,0,1,1,6,6,7,7,4,4,5,5]

To set up the code do the following:
    IN_A  = read_from_file()
    IN_H  = (IN_A & 0xf0)>>4       // Get high nibble into low nibble
    IN_L  = IN_A & 0x0f            // Isolate low nibble
    OUT_H = LOOKUP[IN_L]<<4        // To get into high nibble
    OUT_L = LOOKUP[IN_H]           // this is low nibble
    OUT_A = OUT_H | OUT_L          // merge the two
    if ((IN_A & 0x01) == 0)        // This does the offset on
        OUT_A = OUT_A + 0x08       // the lower nibble
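    char lookup[] = {2,2,3,3,0,0,1,1,6,6,7,7,4,4,5,5};

    /* Decode one byte: the two nibbles are swapped and mapped through
       lookup[]; an even input byte adds 8 to the low output nibble. */
    int decode_character(int encoded)
    {
        if (encoded & 0x01)
            return( (lookup[encoded&0xf]<<4) + lookup[(encoded&0xf0)>>4] );
        else
            return( (lookup[encoded&0xf]<<4) + lookup[(encoded&0xf0)>>4] + 8 );
    }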
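As a quick sanity check, the routine can be run over bytes already seen in this essay. The sketch below restates the same lookup table and mapping so that it compiles on its own; decode_buffer is a small wrapper of mine, not something taken from the program.

```c
#include <assert.h>
#include <stddef.h>

static const unsigned char lookup[16] = {2,2,3,3,0,0,1,1,6,6,7,7,4,4,5,5};

/* Same mapping as decode_character above, restated so this compiles alone. */
int decode_character(int encoded)
{
    int out = (lookup[encoded & 0x0f] << 4) | lookup[(encoded >> 4) & 0x0f];
    return (encoded & 0x01) ? out : out + 0x08;
}

/* Decode len bytes from in to out (the two may be the same buffer). */
void decode_buffer(const unsigned char *in, unsigned char *out, size_t len)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++)
        out[i] = (unsigned char)decode_character(in[i]);
}
```

Fed the first six bytes of the hex dumps (9D 9D 53 73 F4 14) this gives "FF01" followed by CR and LF, and the four characters 'SSss' come out as "0011", which agrees with the plaintext captured from memory.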
When examination of the decoded files was started, one of the first files
looked at was 'copernic.csf'. As this sits in the approot and is named the
same as the application, it was a good candidate for a master configuration or
some kind of global parameters file.
You should remember from earlier that most lines in the conf files seem to
have a 4 digit number (0011) of varying value at the start of the line. The
example given earlier did not show this as clearly as the following example
hopefully will. This is an instruction for the internal scripting language
to tell it how to handle the rest of the line.
This is the decoded version of 'copernic.csf':
FF01
1
TimeStamp=2001-03-09 00:00:00
0015Register
0011ChannelSet="Ad"
0011ChannelSet3="Ad"
0011Version=2525
0011FileVersion=0
0011SoftwareVersions="eng;engplus;engpro;fra;fraplus;frapro"
0016
0015Init
0011UseCookies=True
1001
0011SearchQuerySeparator="+"
1003
0011Key=SearchQuery
0011RNDSEED=""
0018Length(RNDSEED)<>12
0011RNDSEED=String(Random(99999999)*Random(9999))
0019
0011T=Random(999999)
0011PromoT=Numeric(Substring(RNDSEED,8,1))
0011PromoTI=Numeric(Substring(RNDSEED,9,1))
0011Random100=Numeric(Substring(RNDSEED,10,2))
0011SourceFLYCAST=Replace("ENG|1|http://ad-adex3.flycast.com/server/_img/Copernic/software/$RANDOMNUMBER$|http://ad-adex3.flycast.com/server/click/Copernic/software/$RANDOMNUMBER$","$RANDOMNUMBER$",String(T))
0011Source247ENG=Replace(Replace("ENG|1|http://connect.247media.ads.link4ads.com/serv/2/Copernic/ros/468x60/40543;uniq=$RANDOMNUMBER$?$KEY$|http://connect.247media.ads.link4ads.com/click/2/Copernic/ros/468x60/40543;uniq=$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
0011Source247FRA=Replace(Replace("FRA|1|http://connect.247media.ads.link4ads.com/serv/2/fr-Copernic/ros/468x60/40543;uniq=$RANDOMNUMBER$?$KEY$|http://connect.247media.ads.link4ads.com/click/2/fr-Copernic/ros/468x60/40543;uniq=$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
0011SourceUFS="UFS|1|http://banner.unifiedweb.com/cgi-bin/getimage.exe/copernic?GROUP=copernic|http://banner.unifiedweb.com/cgi-bin/redirect.exe/copernic"
0011SourceVALUECLICK="VALUECLICK|1|http://kansas.valueclick.com/cycle?host=hs0136917&b=1&noscript=1|http://kansas.valueclick.com/redirect?host=hs0136917&b=1&v=0"
0011SourceVALUECLICKOLD="VALUECLICK|1|http://kansas.valueclick.com/cycle?host=hs0194203&size=468x60&b=indexpage&noscript=1|http://kansas.valueclick.com/redirect?host=hs0194203&size=468x60&b=indexpage&v=0"
0011SourceSERVERFRA4552=Replace(Replace("BANNERSERVER|1|http://bannerpush.copernicserver.com/RealMedia/ads/adstream_nx.cgi/copernicclient/free/fra/recent/$RANDOMNUMBER$?$KEY$|http://bannerpush.copernicserver.com/RealMedia/ads/click_nx.cgi/copernicclient/free/fra/recent/$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
0011SourceSERVERENG4552=Replace(Replace("BANNERSERVER|1|http://bannerpush.copernicserver.com/RealMedia/ads/adstream_nx.cgi/copernicclient/free/eng/recent/$RANDOMNUMBER$?$KEY$|http://bannerpush.copernicserver.com/RealMedia/ads/click_nx.cgi/copernicclient/free/eng/recent/$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
0011SourceSERVERFRA4551=Replace(Replace("BANNERSERVER|1|http://bannerpush.copernicserver.com/RealMedia/ads/adstream_nx.cgi/copernicclient/free/fra/old/$RANDOMNUMBER$?$KEY$|http://bannerpush.copernicserver.com/RealMedia/ads/click_nx.cgi/copernicclient/free/fra/old/$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
0011SourceSERVERENG4551=Replace(Replace("BANNERSERVER|1|http://bannerpush.copernicserver.com/RealMedia/ads/adstream_nx.cgi/copernicclient/free/eng/old/$RANDOMNUMBER$?$KEY$|http://bannerpush.copernicserver.com/RealMedia/ads/click_nx.cgi/copernicclient/free/eng/old/$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
0012Find("ENGUFS",Edition)<>0
0011SourceUrl=Entry(3,SourceUFS,"|")
0011TargetUrl=Entry(4,SourceUFS,"|")
0013
0012(Find("PLUS",Edition)<>0)or(Find("PRO",Edition)<>0)
0012BuildNumber>4551
0011SourceUrl=Entry(3,SourceVALUECLICK,"|")
0011TargetUrl=Entry(4,SourceVALUECLICK,"|")
0013
0011SourceUrl=Entry(3,SourceVALUECLICKOLD,"|")
0011TargetUrl=Entry(4,SourceVALUECLICKOLD,"|")
0014
0013
0012BuildNumber>4551
0011SelfPromoPercent=0
0013
0012Substring(Edition,1,3)="FRA"
0011SelfPromoPercent=0
0013
0011SelfPromoPercent=10
0014
0014
0012Random100<SelfPromoPercent
0011SourceUrl=Entry(3,SourceSERVERENG4551,"|")
0011TargetUrl=Entry(4,SourceSERVERENG4551,"|")
0013
0012BuildNumber>4551
0012Substring(Edition,1,3)="FRA"
0011SourceUrl=Entry(3,SourceSERVERFRA4552,"|")
0011TargetUrl=Entry(4,SourceSERVERFRA4552,"|")
0013
0011SourceUrl=Entry(3,SourceSERVERENG4552,"|")
0011TargetUrl=Entry(4,SourceSERVERENG4552,"|")
0014
0013
0012Random100>54
0012Substring(Edition,1,3)="FRA"
0011SourceUrl=Entry(3,Source247FRA,"|")
0011TargetUrl=Entry(4,Source247FRA,"|")
0013
0011SourceUrl=Entry(3,Source247ENG,"|")
0011TargetUrl=Entry(4,Source247ENG,"|")
0014
0013
0011SourceUrl=Entry(3,SourceVALUECLICKOLD,"|")
0011TargetUrl=Entry(4,SourceVALUECLICKOLD,"|")
0014
0014
0014
0014
0014
0011RotationInterval=120000
0016
11A2
String | COMMAND | Description |
0011 | SET | SET variable=value |
0012 | IF | IF expression THEN |
0013 | ELSE | ELSE |
0014 | ENDIF | ENDIF |
0015 | FUNC | Function Definition Start |
0016 | ENDFUNC | End Function Def |
0018 | WHILE | WHILE expression DO |
0019 | WEND | End While Loop |
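Turning the raw listing into something readable is then a matter of swapping the four digit prefix for its command name. A minimal sketch follows, using only the codes from the table above; anything else (1001, 1003 and so on) is reported as unknown rather than guessed at. The name opcode_mnemonic is my own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct opcode {
    const char *code;
    const char *mnemonic;
};

/* Only the codes identified in the table above. */
static const struct opcode OPCODES[] = {
    {"0011", "SET"},  {"0012", "IF"},      {"0013", "ELSE"},  {"0014", "ENDIF"},
    {"0015", "FUNC"}, {"0016", "ENDFUNC"}, {"0018", "WHILE"}, {"0019", "WEND"},
};

/* Return the command name for the 4-character prefix of line, or NULL if
 * the prefix is not in the table. If rest is not NULL and the prefix is
 * known, *rest is set to the text following the prefix. */
const char *opcode_mnemonic(const char *line, const char **rest)
{
    assert(line != NULL);
    for (size_t i = 0; i < sizeof OPCODES / sizeof OPCODES[0]; i++) {
        if (strncmp(line, OPCODES[i].code, 4) == 0) {
            if (rest != NULL)
                *rest = line + 4;
            return OPCODES[i].mnemonic;
        }
    }
    return NULL;
}
```

For example, the line 0012BuildNumber>4551 comes back as IF with BuildNumber>4551 as the rest of the line.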
FF01
1
TimeStamp=2001-03-09 00:00:00
FUNC Register
  SET ChannelSet="Ad"
  SET ChannelSet3="Ad"
  SET Version=2525
  SET FileVersion=0
  SET SoftwareVersions="eng;engplus;engpro;fra;fraplus;frapro"
ENDFUNC
FUNC Init
  SET UseCookies=True
  1001
  SET SearchQuerySeparator="+"
  1003
  SET Key=SearchQuery
  SET RNDSEED=""
  WHILE Length(RNDSEED)<>12
    SET RNDSEED=String(Random(99999999)*Random(9999))
  WEND
  SET T=Random(999999)
  SET PromoT=Numeric(Substring(RNDSEED,8,1))
  SET PromoTI=Numeric(Substring(RNDSEED,9,1))
  SET Random100=Numeric(Substring(RNDSEED,10,2))
  SET SourceFLYCAST=Replace("ENG|1|http://ad-adex3.flycast.com/server/_img/Copernic/software/$RANDOMNUMBER$|http://ad-adex3.flycast.com/server/click/Copernic/software/$RANDOMNUMBER$","$RANDOMNUMBER$",String(T))
  SET Source247ENG=Replace(Replace("ENG|1|http://connect.247media.ads.link4ads.com/serv/2/Copernic/ros/468x60/40543;uniq=$RANDOMNUMBER$?$KEY$|http://connect.247media.ads.link4ads.com/click/2/Copernic/ros/468x60/40543;uniq=$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
  SET Source247FRA=Replace(Replace("FRA|1|http://connect.247media.ads.link4ads.com/serv/2/fr-Copernic/ros/468x60/40543;uniq=$RANDOMNUMBER$?$KEY$|http://connect.247media.ads.link4ads.com/click/2/fr-Copernic/ros/468x60/40543;uniq=$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
  SET SourceUFS="UFS|1|http://banner.unifiedweb.com/cgi-bin/getimage.exe/copernic?GROUP=copernic|http://banner.unifiedweb.com/cgi-bin/redirect.exe/copernic"
  SET SourceVALUECLICK="VALUECLICK|1|http://kansas.valueclick.com/cycle?host=hs0136917&b=1&noscript=1|http://kansas.valueclick.com/redirect?host=hs0136917&b=1&v=0"
  SET SourceVALUECLICKOLD="VALUECLICK|1|http://kansas.valueclick.com/cycle?host=hs0194203&size=468x60&b=indexpage&noscript=1|http://kansas.valueclick.com/redirect?host=hs0194203&size=468x60&b=indexpage&v=0"
  SET SourceSERVERFRA4552=Replace(Replace("BANNERSERVER|1|http://bannerpush.copernicserver.com/RealMedia/ads/adstream_nx.cgi/copernicclient/free/fra/recent/$RANDOMNUMBER$?$KEY$|http://bannerpush.copernicserver.com/RealMedia/ads/click_nx.cgi/copernicclient/free/fra/recent/$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
  SET SourceSERVERENG4552=Replace(Replace("BANNERSERVER|1|http://bannerpush.copernicserver.com/RealMedia/ads/adstream_nx.cgi/copernicclient/free/eng/recent/$RANDOMNUMBER$?$KEY$|http://bannerpush.copernicserver.com/RealMedia/ads/click_nx.cgi/copernicclient/free/eng/recent/$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
  SET SourceSERVERFRA4551=Replace(Replace("BANNERSERVER|1|http://bannerpush.copernicserver.com/RealMedia/ads/adstream_nx.cgi/copernicclient/free/fra/old/$RANDOMNUMBER$?$KEY$|http://bannerpush.copernicserver.com/RealMedia/ads/click_nx.cgi/copernicclient/free/fra/old/$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
  SET SourceSERVERENG4551=Replace(Replace("BANNERSERVER|1|http://bannerpush.copernicserver.com/RealMedia/ads/adstream_nx.cgi/copernicclient/free/eng/old/$RANDOMNUMBER$?$KEY$|http://bannerpush.copernicserver.com/RealMedia/ads/click_nx.cgi/copernicclient/free/eng/old/$RANDOMNUMBER$","$KEY$",String(Key)),"$RANDOMNUMBER$",String(T))
  IF Find("ENGUFS",Edition)<>0                                  // if ENGUFS version
    SET SourceUrl=Entry(3,SourceUFS,"|")
    SET TargetUrl=Entry(4,SourceUFS,"|")
  ELSE
    IF (Find("PLUS",Edition)<>0)or(Find("PRO",Edition)<>0)      // PRO or PLUS
      IF BuildNumber>4551                                       // BUILD > 4551
        SET SourceUrl=Entry(3,SourceVALUECLICK,"|")
        SET TargetUrl=Entry(4,SourceVALUECLICK,"|")
      ELSE                                                      // BUILD <= 4551
        SET SourceUrl=Entry(3,SourceVALUECLICKOLD,"|")
        SET TargetUrl=Entry(4,SourceVALUECLICKOLD,"|")
      ENDIF
    ELSE
      IF BuildNumber>4551                                       // BUILD > 4551
        SET SelfPromoPercent=0                                  // clear addshow variable
      ELSE
        IF Substring(Edition,1,3)="FRA"                         // FRENCH
          SET SelfPromoPercent=0                                // clear addshow variable
        ELSE                                                    // ENGLISH
          SET SelfPromoPercent=10                               // set addshow to 10%
        ENDIF
      ENDIF
      IF Random100<SelfPromoPercent                             // if random < addshow
        SET SourceUrl=Entry(3,SourceSERVERENG4551,"|")
        SET TargetUrl=Entry(4,SourceSERVERENG4551,"|")
      ELSE                                                      // if random >= addshow
        IF BuildNumber>4551                                     // BUILD > 4551
          IF Substring(Edition,1,3)="FRA"                       // FRENCH
            SET SourceUrl=Entry(3,SourceSERVERFRA4552,"|")
            SET TargetUrl=Entry(4,SourceSERVERFRA4552,"|")
          ELSE                                                  // ENGLISH
            SET SourceUrl=Entry(3,SourceSERVERENG4552,"|")
            SET TargetUrl=Entry(4,SourceSERVERENG4552,"|")
          ENDIF
        ELSE                                                    // BUILD <= 4551
          IF Random100>54                                       // if random > 54
            IF Substring(Edition,1,3)="FRA"                     // FRENCH
              SET SourceUrl=Entry(3,Source247FRA,"|")
              SET TargetUrl=Entry(4,Source247FRA,"|")
            ELSE                                                // ENGLISH
              SET SourceUrl=Entry(3,Source247ENG,"|")
              SET TargetUrl=Entry(4,Source247ENG,"|")
            ENDIF
          ELSE                                                  // random <= 54
            SET SourceUrl=Entry(3,SourceVALUECLICKOLD,"|")
            SET TargetUrl=Entry(4,SourceVALUECLICKOLD,"|")
          ENDIF
        ENDIF
      ENDIF
    ENDIF
  ENDIF
  SET RotationInterval=120000
ENDFUNC
11A2

So this is a script which seems to control all the adverts, so surely a bit of creative writing is called for. As we already have a decoder we can simply reverse the process to encode the file after we have created the new one.
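To make the nesting easier to follow, here is my reading of the Init branching transcribed into C. It is a sketch under assumptions: I take Find(x,Edition)<>0 to mean "x occurs in Edition", Substring(Edition,1,3) to be the first three characters, and I return the name of the Source variable that would be picked rather than its URLs. The function name select_ad_source is made up.

```c
#include <assert.h>
#include <string.h>

/* Returns the name of the Source* variable the script would select.
 * edition:   e.g. "ENGPRO" or "FRA" (compared case-sensitively here)
 * build:     the script's BuildNumber
 * random100: the script's Random100 value */
const char *select_ad_source(const char *edition, int build, int random100)
{
    assert(edition != NULL);
    int french = strncmp(edition, "FRA", 3) == 0;

    if (strstr(edition, "ENGUFS") != NULL)
        return "SourceUFS";
    if (strstr(edition, "PLUS") != NULL || strstr(edition, "PRO") != NULL)
        return build > 4551 ? "SourceVALUECLICK" : "SourceVALUECLICKOLD";

    /* Remaining editions: only old English builds get 10% self promotion. */
    int self_promo_percent = (build > 4551 || french) ? 0 : 10;
    if (random100 < self_promo_percent)
        return "SourceSERVERENG4551";
    if (build > 4551)
        return french ? "SourceSERVERFRA4552" : "SourceSERVERENG4552";
    if (random100 > 54)
        return french ? "Source247FRA" : "Source247ENG";
    return "SourceVALUECLICKOLD";
}
```

Read this way, every branch ends with a banner source being chosen, the PLUS and PRO editions included.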
The first thing to note is the header at the start of the file:

    FF01
    1
    TimeStamp=2001-03-09 00:00:00

The second is this entry at the end of the file, which seems to be a footer of some kind - when first looked at, it appears that it is possibly some form of CRC.

    11A2

How about if you are told that the length of this file in hex is 0x11C4? Another example is a file with the footer 03AC and a file length of 0x3CE.
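Working the two examples through: 0x11C4 - 0x11A2 = 0x22 and 0x3CE - 0x3AC = 0x22, so in both cases the footer is the file length minus 34 bytes. The sketch below simply encodes that observation. It rests on two samples only, and what exactly the 34 bytes cover is not established here, so treat both function names and the constant as a hypothesis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Gap between file length and footer value seen in both samples. */
#define FOOTER_GAP 0x22L

/* Hypothetical: predict the footer value for a file of the given length. */
long footer_for_length(long file_length)
{
    assert(file_length >= FOOTER_GAP);
    return file_length - FOOTER_GAP;
}

/* Write the footer as four upper-case hex digits, as seen in the files.
 * Returns the number of characters snprintf wanted to write. */
int footer_text(long file_length, char *out, size_t outsz)
{
    return snprintf(out, outsz, "%04lX",
                    (unsigned long)footer_for_length(file_length));
}
```

If the hypothesis holds, any edited file needs its footer recomputed from the new length before it is encoded again.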
FF01
1
TimeStamp=2001-03-09 00:00:00
0015Register
0011ChannelSet="Ad"
0011ChannelSet3="Ad"
0011Version=2525
0011FileVersion=0
0011SoftwareVersions="eng;engplus;engpro;fra;fraplus;frapro"
0016
0015Init
0011UseCookies=True
1001
0011SearchQuerySeparator="+"
1003
0011SelfPromoPercent=0
0011SourceUrl="http://127.0.0.1/"
0011TargetUrl="http://127.0.0.1/"
0011RotationInterval=120000
0016
11A2
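A file like this has to be put back through the substitution before the program will read it. The decoder maps two different encoded bytes onto each plain byte, so an encoder has to choose one of them. In every encoded byte shown in this essay the high nibble is odd, so the sketch below picks that candidate; this choice is my inference from the samples, not something confirmed from the program's code. Bytes of 0x80 and above have no encoding under this table.

```c
#include <assert.h>
#include <stddef.h>

static const unsigned char lookup[16] = {2,2,3,3,0,0,1,1,6,6,7,7,4,4,5,5};

/* The decoder, restated so this sketch compiles alone. */
int decode_character(int encoded)
{
    int out = (lookup[encoded & 0x0f] << 4) | lookup[(encoded >> 4) & 0x0f];
    return (encoded & 0x01) ? out : out + 0x08;
}

/* Find the encoded byte for plain by searching the decoder's outputs.
 * Only candidates with an odd high nibble are accepted (assumption).
 * Returns -1 if plain cannot be encoded (0x80 and above). */
int encode_character(int plain)
{
    for (int c = 0; c < 256; c++) {
        if ((c & 0x10) != 0 && decode_character(c) == plain)
            return c;
    }
    return -1;
}

/* Encode len bytes from in to out. Returns 0 on success, or -1 if a
 * byte could not be encoded (out is then only partly written). */
int encode_buffer(const unsigned char *in, unsigned char *out, size_t len)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        int c = encode_character(in[i]);
        if (c < 0)
            return -1;
        out[i] = (unsigned char)c;
    }
    return 0;
}
```

With this choice '0' encodes to 0x53 ('S'), '1' to 0x73 ('s') and CR LF to F4 14, matching the hex dumps earlier in the essay.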
Looking at the decoded .ssf and .csf files you will see that they share the
same scripting language, with a few additions. So the thought was: since it
parses all the files in the set directories and not specific ones, could
a new file or files be added, and so add engines and groups to the copernic
engine? This would mean that we are no longer tied to the ones they supply,
and it would also prove how it works.
Using one of the groups files as an example, the following file was created:
FF01 1 TimeStamp=2001-03-15 00:00:00
0015Register
0011_Conv="4002->3999 (01-03-15, 10:58:42)"
0011DisplayName="Custom"
0011DisplayNames("FRA")="Custom French"
0011DisplayNames("DEU")="Custom German"
0011DisplayNames("ITA")="Custom Italian"
0011DisplayNames("ESP")="Custom Spanish"
0011DisplayNames("POR")="Custom Portugese"
0011Description="Custom Search Group"
0011Descriptions("FRA")="Custom Search Group"
0011Descriptions("DEU")="Custom Search Group"
0011Descriptions("ITA")="Custom Search Group"
0011Descriptions("ESP")="Custom Search Group"
0011Descriptions("POR")="Custom Search Group"
0011ResultsPerChannel=10
0011TotalResults=1000
0011Version=3000
0011FileVersion=1
0011AutoUpdate=True
0011SearchType="keywords"
0016
0015AfterDownload
0016

This file was saved as 'Custom.ssf', encoded using the encode routine and
placed in the 'Categories' directory. Now to run the application and see if
the group is in the lists. The puzzling thing was that the group did not
appear in the drop down of groups, or in the main tab on the left giving all
the groups, but if we do a search and then browse the groups in that screen,
it is there at the bottom of the list. This might be because we have no search
engines assigned to this group. When we find the group setting in the category
dialog it shows no engines under the group. This is a good sign.
So, to create a search engine file, I will use searchlores' own Namazu
engine as an example. The following file was created:

FF01 1 TimeStamp=2001-03-09 00:00:00
0015Register
0011_Conv="4002->3999 (01-03-09, 10:52:49)"
0011DisplayName="Namazu"
0011HomePage="http://www.searchlores.org/"
0011SupportNew=True
0011Category="Custom"
0011Version=3000
0011FileVersion=2
0011AutoUpdate=True
0011ChannelSet="Custom"
0011ChannelSet3="Custom"
0011SupportOr=True
0011SupportAnd=True
0011SupportQuotes=True
0016
0015Init
0011SourceUrl="http://www.searchlores.org/cgi-bin/search?query="
0011ResultsPerPage=20
100A("")
1004("searchlores.org")
0011Rules("Range").StartMarker="Search Results for"
0011Rules("Range").EndMarker=""
0011Rules("Address").Key=True
0011Rules("Title").StartMarker=">"
0011Rules("Title").EndMarker=""
0011Rules("Title").StartLine=0
0011Rules("Title").NbLines=1
0011Rules("Description").StartMarker=""
0011Rules("Description").EndMarker=""
0011Rules("Description").StartLine=0
0011Rules("Description").NbLines=1
0011SearchQuerySeparator="+"
1003
0016
0015BeforeDownload
1001
1002("query="+SearchQuery)
1002("result=normal")
1002("sort=score")
1002("max=20")
0016
0015AfterDownload
0016

This file was saved as 'Namazu.csf', encoded using the encode routine and
placed in the 'Categories\Engines' directory. Now to run the application and
see if the group is now in the lists.
My aim was not to take the program apart too much, just to get to the data on
the search engines without spending hours looking at assembler code.
But during this task I found out many things about how this program does
other things - some good and some bad. There are a lot of hardcoded bits,
especially to do with language and syntax (lexicon), which cannot be changed
by updates, or at least that is how it appears. I do not like the intrusive
phone home features of this product at all, but at least it uses the proxy
you give it for these requests and does not try to bypass it like some
similar products.
I was very disappointed with the encryption on the data files; mind you, the
application was coded in Delphi. But seriously, you would have thought the
developers would have put a bit more in. After all, if you are going to
put some encryption in, at least make it worthwhile.
The task was also made a bit easier by the fact that the filenames and directory
structure of the configuration files told you exactly what group or engine
each file related to and what to expect in each file. It seems like the
author wants you to get the data out of the program, or at least does not
want to make the task too hard.
In hindsight (always a good thing), once it had been decided that the
method of encryption was a substitution cipher, a known plaintext attack on
the encoded files would have been possible. Collecting the request URLs from
the proxy server, the strings from the executable and the details in the
groups files would have given enough data to recover the encoding method.
This would have worked just as well as the path I chose to follow. It might
have taken a bit longer, but it would have had the same result without even
having to touch a disassembler or debugger. I chose to grab the plaintext
from the program, so a whole file of plaintext could be grabbed in one go and
a translation table built easily, but a partial plaintext lookup generator
program would have worked equally well.
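As a rough illustration of the translation table idea, the Python sketch below builds a decode table from one matching pair of encoded and plain bytes, and inverts it to get an encoder. It assumes a simple byte for byte substitution. The toy_cipher function is invented purely for the demonstration and is NOT Copernic's real table; all function names are my own.

```python
from typing import Dict


def build_table(cipher: bytes, plain: bytes) -> Dict[int, int]:
    """Map each encoded byte to the plain byte seen at the same position."""
    if len(cipher) != len(plain):
        raise ValueError("cipher and plain must be the same length")
    table: Dict[int, int] = {}
    for c, p in zip(cipher, plain):
        # setdefault stores p the first time c is seen and returns the stored
        # value afterwards, so a mismatch means it is not a plain substitution.
        if table.setdefault(c, p) != p:
            raise ValueError(f"byte {c:#04x} maps to two different plain bytes")
    return table


def invert(table: Dict[int, int]) -> Dict[int, int]:
    """Turn a decode table into an encode table."""
    return {p: c for c, p in table.items()}


def translate(data: bytes, table: Dict[int, int]) -> bytes:
    """Apply a table; bytes not yet in the table come out as '?'."""
    return bytes(table.get(b, ord("?")) for b in data)


def toy_cipher(b: int) -> int:
    """Stand-in substitution for the demo only (hypothetical)."""
    return (b * 7 + 3) % 256


known_plain = b'SourceUrl="http://127.0.0.1/"'
known_cipher = bytes(toy_cipher(b) for b in known_plain)

decode_table = build_table(known_cipher, known_plain)
encode_table = invert(decode_table)

# Only characters seen in the known plaintext decode; the rest show as '?'.
other = bytes(toy_cipher(b) for b in b"TargetUrl")
print(translate(other, decode_table))
```

With only partial plaintext the table has holes, which is why a whole file of plaintext grabbed in one go makes life easier: every byte value that occurs in the file gets filled in at once.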
The scripting language they have included interested me most. It has some nice
ideas in it, even though it seems to have its roots in a BASIC type language.
Bot writers and OSLSE project fans should examine it and how it works, as
there is a lot to learn from it. It can provide many pointers and ideas to
programmers of VSLs for bots and other such programs, as it is simple in
concept but offers expandability and flexibility. It also seems a lot more
flexible than a simple macro type VSL, where you include commands in strings
and then parse them out, as in webferret. This does not mean that one is
better or worse than the other, but that both are interesting, and that it
would be easier to include the webferret idea in this than the other way
around. From looking at it, it would be very simple to parse and implement
because of its defined structure and the flexibility of being text based and
not some form of microcode. This also makes it very suitable for inclusion in
a format such as XML, as an embedded script.
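To show how little code such a parser needs, here is a minimal Python sketch that splits a decoded file into its sections. The meanings given to the four digit prefixes are my reading of the listings, not documented facts: 0015 opens a named section, 0016 closes it, 0011 is an assignment, and anything else inside a section is treated as some other operation. Lines outside a section, such as the FF01 header and the footer, are ignored. The function name and the tuple layout are my own.

```python
from typing import Dict, List, Tuple

Statement = Tuple[str, str, str]


def parse_sections(lines: List[str]) -> Dict[str, List[Statement]]:
    """Group the statements of a decoded file by section name."""
    sections: Dict[str, List[Statement]] = {}
    current = None
    for line in lines:
        opcode, rest = line[:4], line[4:]
        if opcode == "0015":            # assumed: start of a section
            current = sections.setdefault(rest, [])
        elif opcode == "0016":          # assumed: end of a section
            current = None
        elif current is not None:
            if opcode == "0011":        # assumed: name=value assignment
                name, _, value = rest.partition("=")
                current.append(("set", name, value))
            else:                       # any other opcode, kept as-is
                current.append(("op", opcode, rest))
    return sections


# A cut down copy of the modified advert file shown earlier.
sample = [
    "FF01 1 TimeStamp=2001-03-09 00:00:00",
    "0015Register",
    '0011ChannelSet="Ad"',
    "0016",
    "0015Init",
    "0011UseCookies=True",
    "1001",
    '0011SourceUrl="http://127.0.0.1/"',
    "0016",
    "11A2",
]

parsed = parse_sections(sample)
for name, statements in parsed.items():
    print(name, statements)
```

Because every line carries its own prefix, the parser needs no lookahead and new operations can be added without touching the existing cases.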
Firstly I would like to point out that you should try to learn how your
target works before trying to take it apart. Reading the essay, you should
hopefully have seen how the clues picked up early on helped later in the
process. While you are installing, LOG what the program does. When you run the
program for the first and subsequent times, LOG what the program does. These log
files cost nothing to make (apart from the time to start filemon and regmon)
and will save you doing it later. Then when a question comes up you do not
have to think 'oh, I must uninstall and reinstall to get a log of every
change' - and not everything may be removed or put back, depending on the
program. So do it the first time. Pick your target and work it, right from
the start.
After seeing the script code I realised that I was trying to over complicate
matters and produce some fancy parsing macro type thing for the parsing part
of my bot. Seeing this has brought me back to a simple but very expandable
idea, which will be much easier to implement and expand as development
requires. Sometimes it takes seeing another point of view to bring some
clarity to your thoughts and put you back on the right track.
If you are going to write a paper on a subject you would normally research
other works on the same subject first; surely the same should be done if
you are working on some software. This might save you from reinventing the
wheel as a square. I am not saying use their ideas exactly as they do,
but you should observe and learn from them, then create a solution which
brings together all the parts most suited to your task.
I would also like to point out that people tend to download and use software
without really understanding what it does, or what data about them goes where.
You should take care over what software you use and should understand the
hidden data that it sends about you. A prime example is the entry in the
advert request in this product which tells them what you are searching for,
quite apart from the update and regcard information. Most products of this
type seem to conduct this form of activity, and users should be made
aware of this before using them.
The use of adverts in products is actually robbing, yes robbing, the users
of their precious bandwidth: while they are showing adverts you are losing
bandwidth. I believe that reducing the advert shown to a 1x1 image or
simply hiding the advert is not a solution, as you are still using bandwidth.
The only proper method of advert removal is to make sure the request never
gets out, or at least not as far as your internet connection.
I must point out that during the writing of this essay, at no point was Copernic
allowed to interact with the internet in any way, shape or form. It has now
been removed from the PC it was installed on and will not be returning.
A lot of information was gained from log files, and some reversing of course! ;)
Hope you enjoyed reading.
Copyright (c) 2001, WayOutThere