Part 5.a: Implement flow-inspector on CentOS 6

May 2nd, 2013: Lothar Braun, one of the original authors of the paper/software, has written his own writeup explaining how to push from VERMONT to the redis queue for flow-inspector.

 

Read the paper:

Seriously. Stop now and read the paper.

Install backend stuff (a few are my preference):

echo "[fedora_repo]" > /etc/yum.repos.d/fedora_repo.repo #allow yum access to the fedora (EPEL) repo; quoted to avoid shell globbing, and > to start a fresh file
echo "name=fedora_repo" >> /etc/yum.repos.d/fedora_repo.repo
echo "baseurl=http://download1.fedora.redhat.com/pub/epel/\$releasever/\$basearch/" >> /etc/yum.repos.d/fedora_repo.repo
echo "enabled=1" >> /etc/yum.repos.d/fedora_repo.repo
echo "skip_if_unavailable=1" >> /etc/yum.repos.d/fedora_repo.repo
echo "gpgcheck=0" >> /etc/yum.repos.d/fedora_repo.repo
yum -y install make wget vim cronie crontabs bind-utils git gcc gcc-c++ cmake curl-devel python-devel python-pip redis libpcap libpcap-devel pcre pcre-devel mysql mysql-devel mysql-server
wget http://www.python.org/ftp/python/2.7.4/Python-2.7.4.tgz #needed for argparse module used in preprocessor.py
tar zxvf Python-*
cd Python-*
./configure && make && make install
cd
wget http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.7.egg
sh setuptools-0.6c11-py2.7.egg
curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
python get-pip.py
pip-2.7 -v install distribute paste jinja2 ujson redis mysql-python pymongo
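
A quick sanity check that the modules installed against python 2.7 (import names differ from the pip package names, e.g. mysql-python imports as MySQLdb):

python2.7 -c "import argparse, redis, MySQLdb, pymongo, jinja2, ujson, paste; print 'modules OK'"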

Start redis:
redis will bind to 127.0.0.1:6379 by default.

chkconfig --level 345 redis on
service redis start
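
Quick check that redis is up and listening:

redis-cli ping #should return PONG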

Configure SELinux:
For now, set it to permissive:

setenforce permissive
sed s/SELINUX\=enforcing/SELINUX\=permissive/ -i /etc/sysconfig/selinux

Add exception to firewall for web UI:
Edit the iptables file, adding a jump to ACCEPT for tcp port 8080.
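
On CentOS 6 that means editing /etc/sysconfig/iptables directly, or inserting the rule live and saving it. A minimal sketch, assuming the stock INPUT chain:

iptables -I INPUT -p tcp --dport 8080 -j ACCEPT
service iptables save #persists the rule to /etc/sysconfig/iptables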

Secure your mysql instance:

chkconfig mysqld on
service mysqld start
/usr/bin/mysql_secure_installation
#enter for no root password
#Y to set root password
#enter a password
#Y to remove anonymous user
#Y to disallow root login remotely
#Y to remove test database and access to it
#Y to reload privilege tables now
#done

Create flow-inspector DB and create user:
Please change flowinspector_DBUSER_PASSWORD to something secure.

cd
echo "CREATE DATABASE flowinspector;" >> ~/flowinspector.sql
echo "GRANT ALL PRIVILEGES ON flowinspector.* to 'flowinspector'@'localhost' IDENTIFIED BY 'flowinspector_DBUSER_PASSWORD';" >> ~/flowinspector.sql
echo "FLUSH PRIVILEGES;" >> ~/flowinspector.sql
mysql -p < ~/flowinspector.sql
shred -u -z -n 30 ~/flowinspector.sql
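
Verify the new user can log in:

mysql -u flowinspector -p -e "SHOW DATABASES;" #should list flowinspector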

Download flow-inspector:

cd
git clone http://github.com/constcast/flow-inspector.git

Configure flow-inspector:
Additional information on the options can be found on the project wiki.

Please change flowinspector_DBUSER_PASSWORD to something secure.

We will be using MySQL as our backend:

cd
cd flow-inspector
cp ./config/config.default.py ./config/config.py
chmod 660 ./config/config.py
sed s@"debug = True"@"debug = False"@ -i ./config/config.py
sed s@"db_backend = \"mongo\""@"db_backend = \"mysql\""@ -i ./config/config.py
sed s@"db_port = 27017"@"db_port = 3306"@ -i ./config/config.py
sed s@"db_user = None"@"db_user = \"flowinspector\""@ -i ./config/config.py
sed s@"db_password = None"@"db_password = \"flowinspector_DBUSER_PASSWORD\""@ -i ./config/config.py
sed s@"db_name = \"flows\""@"db_name = \"flowinspector\""@ -i ./config/config.py

Leave the following settings as is, but get familiar with the contents:

  • Original Flow DB
  • Destination Data Backend
  • Flow settings
  • Preprocessor

Processing live data from the redis queue:
To flush the cache to the backend DB, the setting `live_import` must be set to `True` in config.py. With this enabled, preprocess.py will flush the redis queue to the backend DB without waiting for the entry string “END”: first after 10 minutes, then every 5 minutes thereafter, until the process is stopped.
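
Following the sed pattern above (assuming the shipped default line reads live_import = False):

sed s@"live_import = False"@"live_import = True"@ -i ./config/config.py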

Start the system:

cd
cd flow-inspector
nohup python ./app/app.py > /dev/null &
nohup python ./preprocess/preprocess.py > /dev/null &

preprocess.py will output statistics reports every 10 seconds. You can run it under `nohup … &` as above, or start it in the foreground and later drop it to the background and `disown` it (watch out for data loss while it is suspended).
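
A minimal backgrounding sketch for the foreground case:

python ./preprocess/preprocess.py
#Ctrl+Z to suspend, then:
bg
disown %1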

Push flow data to flow-inspector:

[Diagram: the general data flow from packets to the flow-inspector web UI.]

Monitoring the redis queue:
To monitor the contents of the redis queue, run the following:

redis-cli monitor | awk '{print strftime("%F %H:%M:%S output: ", $1),$0}'

Option 1: Configure VERMONT to take incoming packets and send IPFIX to the redis queue:
In my setup, I will be using VERMONT as my IPFIX probe, as it can push natively to the redis queue using a VERMONT module currently in development called ipfixFlowInspectorExporter. There is a separate write-up focused on implementing the VERMONT probe.

Option 2: Use rasqlinsert from the argus project:
Previously, I set out to use argus as the probe, rasqlinsert to insert argus records into a DB, and import_db_to_redis.py to move them into the redis queue. Lothar Braun extended import_db_to_redis.py to include an argus DB source at my humble request (thanks!).

As Lothar had stated to me, the import requires a DB table with at least the following columns:

+--------+-----------------------+------+-----+---------+-------+
| Field  | Type                  | Null | Key | Default | Extra |
+--------+-----------------------+------+-----+---------+-------+
| stime  | double(18,6) unsigned | NO   |     | NULL    |       |
| dur    | double(18,6)          | NO   |     | NULL    |       |
| proto  | varchar(16)           | NO   |     | NULL    |       |
| saddr  | varchar(64)           | NO   |     | NULL    |       |
| sport  | varchar(10)           | NO   |     | NULL    |       |
| daddr  | varchar(64)           | NO   |     | NULL    |       |
| dport  | varchar(10)           | NO   |     | NULL    |       |
| spkts  | bigint(20)            | YES  |     | NULL    |       |
| dpkts  | bigint(20)            | YES  |     | NULL    |       |
| sbytes | bigint(20)            | YES  |     | NULL    |       |
| dbytes | bigint(20)            | YES  |     | NULL    |       |
+--------+-----------------------+------+-----+---------+-------+

To target a specific table, use the `--table-name` argument.

Use cron to schedule imports from the database, with the run line:

import_db_to_redis.py --argus-db --src-host [argus mysql/postgresql db host] --src-port [argus mysql/postgresql db host port] --src-user [argus mysql/postgresql db user] --src-password [argus mysql/postgresql db user password] --src-database [argus mysql/postgresql db name]
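
For example, a crontab entry running the import every 5 minutes (the host/port/user/database values here are illustrative placeholders; adjust the paths to your layout, and add --table-name if you need to target a specific table):

*/5 * * * * /usr/local/bin/python2.7 /root/flow-inspector/preprocess/import_db_to_redis.py --argus-db --src-host localhost --src-port 3306 --src-user argus --src-password [argus db user password] --src-database argus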

Option 3: Configure VERMONT to take incoming packets and send IPFIX flows to a DB:
For this, you want to use the VERMONT module IpfixDbWriter.

Use cron to schedule imports from the database, with the run line:

import_db_to_redis.py --src-host [vermont mysql/postgresql db host] --src-port [vermont mysql/postgresql db host port] --src-user [vermont mysql/postgresql db user] --src-password [vermont mysql/postgresql db user password] --src-database [vermont mysql/postgresql db name]

Read the paper again:

Seriously. Stop now and read the paper again.

 
Additional info…
Here is some additional info on flow-inspector.

DELETE anything older than a month:
Since I run argus alongside flow-inspector, I want to save drive space and some processing power by deleting records so that only the previous month is displayed in the web UI. I considered archiving, then realized there is practically no need to refer to flow data older than a month in this manner. If you wish to, you can always use the argus import script to regenerate the table data from argus records.

You can use a MySQL EVENT to perform this task (following the included example for flow-inspector).
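
A sketch of such an EVENT, following the mysql-from-shell style used earlier (you will be prompted for the root password; the table name and the firstSwitched timestamp column are assumptions, so substitute the actual flow table(s) in your flowinspector DB):

mysql -u root -p flowinspector <<'EOF'
SET GLOBAL event_scheduler = ON;
CREATE EVENT expire_old_flows
ON SCHEDULE EVERY 1 DAY
DO DELETE FROM your_flow_table WHERE firstSwitched < UNIX_TIMESTAMP(NOW() - INTERVAL 1 MONTH);
EOF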

Sending data to the backend DB:
In order to understand this a bit better, it’s useful to understand the redis queue.

The target redis queue is a queue named “entry:queue” which accepts flow data as JSON strings.

The JSON strings contain the fields that are defined in flow_aggr_values and flow_aggr_sums.

If the queue receives a redis message whose string is “END”, or the python instance hosting preprocess.py receives a SIGTERM, then preprocess.py will flush data from memory to the backend DB.
There is a pending issue to better manage the in-memory cache, likely by flushing data to the backend DB at some threshold.
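
To see the shape of this, you can push a hand-rolled entry straight from redis-cli. The field names below are purely illustrative (use whatever your flow_aggr_values/flow_aggr_sums define), and I am assuming a right-side push; check which end preprocess.py pops from before relying on ordering:

redis-cli rpush entry:queue '{"srcIP": "10.0.0.1", "dstIP": "10.0.0.2", "pkts": 1, "bytes": 60}'
redis-cli rpush entry:queue "END" #signals preprocess.py to flush to the backend DB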

Extending import:
The diagram at the beginning of this page covers the general flow of data from packets to web UI.
The redis queue can be considered the stable entry point for data.

Lothar recommends taking a look at ./lib/importer_modules.py and adding code to ./preprocess/import_db_to_redis.py to process plain text data.

Additionally, understanding more about the structure that is expected of the redis queue would allow you to author clients that would populate the redis queue directly (as Vermont does).
A possibly interesting client here would be an ra* client for argus (“raredis”, for instance). I dropped an email to Carter Bullard of QoSient (the author of argus) about Lothar’s support for rasqlinsert-populated DBs, and received no response.

Extending the final product:
I’d love to sit down and dig deeper into the web UI and backend and implement GeoIP data somehow.
The IPFIX specification doesn’t natively carry GeoIP location information. This makes sense, as it should be up to a client to handle this.
argus’s client suite has this feature (previously covered), and it would be great to populate the backend DB with this additional information, then create a heatmap-style extension to the flow-inspector web UI.

Adapting flow-inspector:
You can adapt flow-inspector by managing some of the “Flow settings,” such as flow_aggr_values and flow_aggr_sums.

Alternatives to the single-threaded python HTTP server:

There are several backends for the Bottle framework, as covered in the Bottle documentation. The flow-inspector devs have included a run() call for Paste that can be un-commented at the bottom of ./app/app.py.
