Installation Instructions for Scrutinizer
*****************************************


This document describes the steps needed to install the Scrutinizer
system.

The required libraries and additional tools are described here as well.


Requirements:
-------------

If you just want to analyze old logfiles you simply need:

- Perl 5.6 or higher
- gnuplot
- a POSIX-compliant OS
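
You can quickly check whether the required Perl version and the optional
modules listed below are present. This is only a sketch; the module names
are taken from the lists in this document, and a missing optional module
merely disables the corresponding feature:

```shell
# Print the Perl version (5.6 or higher is required).
perl -e 'print "Perl $]\n"'

# Probe the optional Perl modules mentioned in this document. A missing
# module only disables the corresponding feature, not the core system.
for m in Curses::UI File::Tail RRDs; do
    perl -M"$m" -e1 2>/dev/null && echo "$m: installed" || echo "$m: missing"
done
```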


Optional Requirements:
----------------------

If you want to use mod_scrutinizer as an additional defense layer
that blocks wrongdoers on the Apache webserver, you need:

	- gcc
	- Apache 2.X or higher

If you want to use the visualization tools you need:

	- Curses::UI Perl module for the visualization tools

	For the RRD Viewer you additionally need:

	- rrdtool with Perl bindings (RRDs.pm)

If you want to use the nice LIVE mode you need:

	- File::Tail Perl module


Extracting:
-----------

The software is distributed as a gzipped tarball. Extract the tarball
and copy its content into /usr/local.

Now this directory contains the following structure:


/usr/local/scrutinizer
|
|-- INSTALL            - this file
|-- LOG                - here the logfile will be created
|-- README             
|-- SM                 - Perl modules needed by scrutinizer
|   |-- Blacklist.pm
|   |-- Conf.pm        - the configuration file (edit this)
|   |-- Defs.pm
|   |-- GV.pm
|   |-- HashTable.pm
|   |-- Init.pm
|   |-- Loop.pm
|   |-- PP.pm
|   |-- Parameters.pm
|   |-- Plot.pm
|   |-- RRD.pm
|   |-- Scrutinizer.pm
|   |-- Statistic.pm
|   `-- Tools.pm
|-- mod_scrutinizer    - optional Apache module for scrutinizer
|   |-- 420.html
|   |-- blacklist.c
|   `-- mod_scrutinizer.c
|-- tools              - directory with additional tools
|-- run                - runtime directory for pid-files and pipes
|-- scc                - scrutinizer command console (executable)
|-- scrutinizer        - the scrutinizer daemon (executable)
`-- srrdv              - scrutinizer rrd viewer (executable)


Optional Apache module mod_scrutinizer:
---------------------------------------

The main parts of the system don't need to be compiled, because they're
entirely written in Perl. Only the optional Apache module and its helper
application need to be compiled.

For compiling the helper application just type:

$ gcc -o blacklist blacklist.c

For compiling the Apache module the easiest way is:

$ apxs2 -c -i -a mod_scrutinizer.c

This will compile the module and copy the created shared object file
into the library directory of the Apache webserver (default:
/usr/lib/apache2/). If you have a custom installation, the module needs
to be placed into apache/modules.

That's all about compiling.

After compiling and moving the module to its destination the
httpd.conf needs to be adjusted. The following directives are needed
by mod_scrutinizer:

If you've installed the scrutinizer into a custom directory, please
adjust the paths!

- ScrutinizerPipeFile		  "/usr/local/scrutinizer/run/bad_ip"
  This is where the named pipe will be created by the helper
  application. If you want to include mod_scrutinizer into the system,
  you'll have to configure the scrutinizer such that the alert command
  writes into this named pipe.

- ScrutinizerPidPath		  "/usr/local/scrutinizer/run"
  This is where the pid-file will be created by the helper application.
  The name of the pid-file consists of the filename of the helper
  application and a '.pid' extension.

- ScrutinizerApplication	  "/usr/local/scrutinizer/mod_scrutinizer/blacklist"
  This is where the helper application is located.

- LoadModule scrutinizer_module modules/mod_scrutinizer.so
  Finally, tell Apache to load the module.

- ErrorDocument 420 /420.html
  Here a custom error document can be specified. This is helpful for
  the user, because mod_scrutinizer returns the status code 420 if an
  IP is on the blacklist. This status code is not yet in use and thus
  not specified in RFC 2616. With a custom error document, users can be
  informed why they're unable to access the site.
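
Putting the directives above together, the mod_scrutinizer part of the
httpd.conf could look like this (a sketch using the default paths from
this document; adjust them to your installation):

```apache
# Load the module and point it at the helper application and its
# runtime files (default paths; adjust for a custom installation).
LoadModule scrutinizer_module modules/mod_scrutinizer.so

ScrutinizerPipeFile    "/usr/local/scrutinizer/run/bad_ip"
ScrutinizerPidPath     "/usr/local/scrutinizer/run"
ScrutinizerApplication "/usr/local/scrutinizer/mod_scrutinizer/blacklist"

# Custom error page shown to blacklisted clients (status code 420).
ErrorDocument 420 /420.html
```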

Now you need to copy the 420.html from the mod_scrutinizer directory to
apache/htdocs or wherever you specified it in the httpd.conf.


Configure Scrutinizer:
----------------------

Now it's time to configure the analysis engine. For this purpose open the
Conf.pm located in scrutinizer/SM. Inside the config file the specific
settings are well documented. Please read them carefully and customize
the system to your taste.
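
As a starting point, the configuration directives referenced elsewhere
in this document are collected below. This is only a sketch built from
the settings named in this file; SM/Conf.pm itself documents the full,
authoritative list and the exact syntax:

```perl
# Settings mentioned in this INSTALL file (sketch only; the logfile
# path is an example and must match your Apache setup).
$MODE=DEAD;                                    # DEAD or LIVE, see below
$DEF_LOGFILE="/var/log/apache2/access_log";    # logfile to process
$DO_TRAINING=0;                                # 1 = training run
$DO_RRD_STAT=0;                                # 1 = enable the RRD (needs RRDs.pm)
```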


Start the System:
-----------------

Bringing the whole system up is quite easy (if the configuration is
done well). Simply type /usr/local/scrutinizer/scrutinizer and it will
start to process the defined $DEF_LOGFILE. If you've configured the
system to run in LIVE* mode, the scrutinizer reads (using File::Tail)
continuously from the access_log of your Apache webserver and does its job.

If you haven't installed File::Tail but would like to use the LIVE
mode anyway, there's a possible hack.

Configure the scrutinizer for DEAD* mode and let it read from standard
input (argument -). This can be done with the following command line:

$ tail -f /var/log/apache2/access_log | ./scrutinizer -


*LIVE/DEAD Mode:
There exist two different modes in which the system can be run.
The DEAD mode is used to do the training or to analyze old logfiles.
In LIVE mode the logfile is read continuously, meaning that only new
entries are read, as soon as they are written to the logfile.


Train the System:
-----------------

Before the scrutinizer fits your specific environment, it's necessary
to train the system's thresholds. This can be done in the training
mode. Depending on how you archive your logfiles, there are some
small differences.

Example domain_access_log-20040101.gz:

domain_access_log-20040101.gz contains the access_log of January 1,
2004. To train the system with this data, you have to do the
following steps:

1.) configure DEAD mode
	$MODE=DEAD;

2.) configure training mode
	$DO_TRAINING=1;

3.) analyze the training data
	$ zcat domain_access_log-20040101.gz | scrutinizer -

Now the scrutinizer analyzes all requests contained in the access_log.
New thresholds will be calculated which are typical for this specific
webserver.
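
Since zcat concatenates its arguments into one stream, several archived
logfiles can be fed into a single training run. A small sketch (the file
names follow the example pattern above; the actual scrutinizer pipeline
is shown commented out because it needs a configured installation):

```shell
# Two tiny stand-in archives demonstrate that zcat simply concatenates:
printf 'GET /a HTTP/1.0\n' | gzip > domain_access_log-20040101.gz
printf 'GET /b HTTP/1.0\n' | gzip > domain_access_log-20040102.gz
zcat domain_access_log-2004010?.gz | wc -l   # both log lines, one stream

# The real training run over a whole month would then be:
# zcat domain_access_log-200401*.gz | /usr/local/scrutinizer/scrutinizer -
```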


Watch it working:
-----------------

If you succeeded in training and starting up the system properly,
congratulations! Now you can look over the scrutinizer's shoulder with
the visualization tools. As mentioned above, the Perl module Curses::UI
is absolutely required by these tools. Please make sure that this module
is installed properly, otherwise you won't be able to start them.

- the scrutinizer command console (scc)

  with scc you can look into the current data structure of the analysis
  engine. All current sessions including the statistical values and
  the request history can be displayed. Also a list of the current
  blacklisted hosts is available. It's possible to manually ban and
  unban specific hosts.

- the scrutinizer round robin database viewer (srrdv)

  with srrdv you have access to the round robin database of the
  scrutinizer. In this database all key data is stored over a long
  period of time. Information like the average alert level of the last
  week can be found here. The round robin database is an optional
  feature of the scrutinizer. You can activate it with the configuration
  directive $DO_RRD_STAT=1, but you must have installed rrdtool
  (including RRDs.pm) properly.


Logfiles:
---------

The working scrutinizer creates a bunch of logfiles, including details
about the blocked hosts (debug_IP files), statistical logfiles of
the different analysis functions (stat-X.log) and, most importantly,
the lists of the function values a client got when it exceeded an
alert limit (alerts.logX, where X is the alert level 1-2).

All logfiles are stored in the LOG directory (default:
/usr/local/scrutinizer/LOG).

The round robin database containing the key data of the scrutinizer is
stored in the LOG directory too (scrutinizer.rrd).
