Posts Tagged ‘flow’

My bleeding heart: Dear argus, I miss you.

April 9, 2014

Since I started a new job, I’ve got a lot of stuff to master before I revisit implementing flow data.

With all the Heartbleed reaction craze, I noticed that some Snort defs were released the other day, and that means there are likely IOCs that can be found in historical flow data.

Carter looks like he’s going to start a write up shortly, so keep an eye on the mailing list.


How to build an raservices().conf file effectively

July 17, 2013

After further investigation into the nDPI libs, it became clear that there was very little data from which to pull byte patterns. The majority of the definitions consider MANY more aspects to be essential to classifying a flow.

Therefore, to actually generate an raservices().conf file effectively, I would say get a very large data set:
1) replay it against nDPI
2) replay it against libprotoident
3) replay it against rauserdata() -M printer="encode32"

You will then be able to align protocol definitions.

There is no reason why efforts can’t be cumulative. As far as Carter is concerned, I’m sure he’d be happy to append a larger std.sig file to the distro.

So, although it was fun, it became clear that my work was not going to reach the goal with the reliability I had hoped for.

Keeping an eye out for stable ntopng

June 26, 2013

It appears that I missed the partial desublimation of ntopng, a newer version of ntop. I’ve yet to build it, but it looks great. Maybe it will replace the unstable flow-inspector that I am currently using. Unfortunately, I cannot give it the time it deserves, or contribute code, at this point.

I will be keeping an eye on the ntop user’s mailing list that I am subscribed to via feedburner.

Anomaly detection methodology with argus data

June 21, 2013

Monitoring policy packet and byte hit counts on a Fortigate via SNMP

June 14, 2013

Within the MIB for the Fortigate, there are two OIDs that contain the policy hit counts:

Number of packets matched to policy (passed or blocked, depending on policy action). Count is from the time the policy became active. = policy packet count for policy ID P, in VDOM V
Number of bytes in packets matching the policy. See fgFwPolPktCount. = policy byte count for policy ID P, in VDOM V

I just created a DENY policy for a variety of geographic regions, a feature of the Fortigate. Although I am also monitoring destination country code information with argus, I have not yet integrated argus into an IDS platform. Before I do this, I can quickly set up an icinga/nagios service to query this value and report when it increases above 0. I am logging policy violations within the Fortigate so that I can quickly review the source, revert to argus, then to the workstation itself.
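
The alerting logic of such a service check can be sketched in a few lines of Python. This is only an illustration, not a finished plugin: the function name `check_policy_hits` and its `threshold` parameter are made up for this example, and fetching the counter over SNMP is deliberately left out. The exit codes follow the usual Nagios/Icinga plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_policy_hits(raw_value, threshold=0):
    """Map a policy packet counter to an (exit_code, message) pair.

    raw_value is the counter as text, for example the value printed by an
    SNMP query tool. Fetching it is outside the scope of this sketch.
    """
    try:
        count = int(str(raw_value).strip())
    except ValueError:
        return UNKNOWN, "UNKNOWN - could not parse counter %r" % (raw_value,)
    if count > threshold:
        return CRITICAL, "CRITICAL - deny policy hit %d times" % count
    return OK, "OK - deny policy hit count is %d" % count
```

A wrapper script would print the message and exit with the returned code, which is what icinga/nagios reads to decide the service state.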

Effectively parsing PCAPs with python’s querycsv

Here is a quick method I used today, before implementing an argus probe in a sustainable way, to create and parse PCAPs to determine high bandwidth offenders.

Create PCAPs:

ifconfig eth0 promisc
netstat -i | grep eth #check for the P flag
mkdir /pcaps/
nohup tcpdump -i 1 -w /pcaps/pcap -C 20 -W 250 'host' -Z root & #not really secure, look at -Z

Note that file names with _underscores_ or -dashes-, and file names that contain only numbers, are not handled well.

This will generate /pcaps/pcapNNN…

Process PCAPs on a Windows box with tshark.exe:

echo echo ipsrc,ipdst,dstporttcp,srcporttcp,len ^> %1.csv > %temp%\tshark_len.bat
echo ".\Wireshark\tshark.EXE" -r %1 -T fields -e ip.src -e ip.dst -e tcp.dstport -e tcp.srcport -e frame.len -E separator=, -E quote=d ^>^> %1.csv >> %temp%\tshark_len.bat
for /r . %G in (pcap*) do %temp%\tshark_len.bat %G

`tcpdump` has relatively the same syntax.

Query CSVs with SQL statements:
Use a Python module to quickly return calculations:

pip install querycsv

Per-file operations
Find the total bytes incoming to host: -i test.csv "select ipsrc, ipdst, dstporttcp, srcporttcp, sum(len) as 'len_in_bytes' from test group by ipsrc"

Find the total bytes outgoing from host: -i test.csv "select ipsrc, ipdst, dstporttcp, srcporttcp, sum(len) as 'len_in_bytes' from test group by ipdst"

For all files:

echo ipsrc,ipdst,dstporttcp,srcporttcp,len > mergedpcap.csv
cat *.csv | grep -v "ipsrc,ipdst,dstporttcp,srcporttcp,len" >> mergedpcap.csv
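
Since the CSVs above were produced on a Windows box, where cat and grep may not be available, the same merge can be sketched in Python. This is a rough equivalent under a few assumptions: the function name `merge_csvs` is made up for this example, every input file uses the five-column header shown above, and the header may carry stray whitespace left over from the batch `echo`.

```python
import csv
import glob
import os

HEADER = ["ipsrc", "ipdst", "dstporttcp", "srcporttcp", "len"]


def merge_csvs(pattern, out_path):
    """Concatenate per-PCAP CSVs into one file with a single header row."""
    out_abs = os.path.abspath(out_path)
    # Collect inputs first and never read the output file itself.
    inputs = [p for p in sorted(glob.glob(pattern))
              if os.path.abspath(p) != out_abs]
    rows_written = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(HEADER)
        for path in inputs:
            with open(path, newline="") as src:
                for row in csv.reader(src):
                    # Skip blank lines and the header repeated in each file.
                    if not row or [cell.strip() for cell in row] == HEADER:
                        continue
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Calling `merge_csvs("*.csv", "mergedpcap.csv")` in the directory holding the CSVs returns the number of data rows copied.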

Find the total bytes incoming to host: -i mergedpcap.csv "select distinct dstporttcp, ipsrc, ipdst, sum(len) as 'len_in_bytes' from mergedpcap group by ipsrc" > incoming.log
The record where ipsrc is the targeted host will return the TOTAL length of all packets sent from the targeted host (all uploaded, yes UPloaded, bytes).

Find the total bytes outgoing from host: -i mergedpcap.csv "select distinct dstporttcp, ipsrc, ipdst, sum(len) as 'len_in_bytes' from mergedpcap group by ipdst" > outgoing.log

The record where ipdst is the targeted host will return the TOTAL length of all packets sent to the targeted host (all downloaded, yes DOWNloaded, bytes).

Other solutions:
You’ll notice that querycsv first imports the csv data into an in-memory sqlite3 db. This makes offering a full set of SQL queries and functions trivial.
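
To make that idea concrete, here is a minimal sketch of the same approach using only Python's standard library. It is not querycsv's actual code: the function name `total_bytes` and the table name `pcap` are made up for this example, and it assumes the five-column CSV layout produced earlier.

```python
import csv
import sqlite3


def total_bytes(csv_path, group_col):
    """Load a CSV into an in-memory SQLite table and sum len per group_col."""
    if group_col not in ("ipsrc", "ipdst"):
        raise ValueError("group_col must be 'ipsrc' or 'ipdst'")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table pcap (ipsrc text, ipdst text, "
        "dstporttcp text, srcporttcp text, len integer)"
    )
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row
        rows = [row for row in reader if len(row) == 5]
    conn.executemany("insert into pcap values (?, ?, ?, ?, ?)", rows)
    # group_col is checked against a whitelist above, so it is safe to format.
    query = (
        "select %s, sum(len) as len_in_bytes from pcap "
        "group by %s order by len_in_bytes desc" % (group_col, group_col)
    )
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

`total_bytes("mergedpcap.csv", "ipsrc")` returns one `(address, bytes)` tuple per source address, largest first, so the heaviest senders appear at the top.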

There are other options for this sort of situation:
1) PCAP to SQL platforms: pcap2sql and c5 sigma.
2) SQL querying PCAP directly: PacketQ (which lacks some SQL queries and functions, see here). Here is a neat example of displaying some results.
3) robust solutions like pcapr-local, with integration to mu studio.

Understanding a Vermont config file

April 30, 2013

Documentation summary:
I will be summarizing the documentation available within the project’s github wiki to give a simple understanding of a fairly complex configuration process.

For our example we are concerned with setting up Vermont to be an IPFIX generator, receiver, and transmitter to a local redis queue “entry:queue.”

Note that as of April 30th, 2013, there is no redis queue output module for Vermont. I have reached out to Lothar to find out more about this possibility. Until then, since our goal is to configure Vermont to be a probe to populate flow-inspector’s backend DB, I will be using Vermont to receive PCAP and output to a database. However, I will cover outputting IPFIX packets using the ipfixExporter module. See the bottom of this page for more output options, including IpfixDbWriter which I will be using for my flow-inspector instance.

Working with the modular design:
Vermont config files operate like our friends’ nsclient++, nxlog, and rsyslog config files, in a modular design.

The data flow is decided upon by the “id” of each module (declared in the module root xml node), and the “next” child node of that module root node.

Multiple “next” tags can be included, but there are special stipulations for this. Translating docs from developer speak to admin speak:
1) Each receiving module will see the same packets, not a copy.
2) So… if a module modifies the data, then it is modified. This can be confusing for the other modules.
If multiple receivers are used, then a queue can be used. This allows the modules to do work in a synchronous manner.
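
As a rough illustration of how “id” and “next” wire modules together, the Python sketch below walks a config and lists each connection. The sample XML is a hypothetical, trimmed-down config written for this example (real configs carry many more settings), and the helper name `module_edges` is also made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down config used only to illustrate id/next wiring.
SAMPLE_CONFIG = """
<ipfixConfig>
  <observer id="1"><next>2</next></observer>
  <packetQueue id="2"><next>3</next></packetQueue>
  <packetAggregator id="3"><next>4</next></packetAggregator>
  <ipfixQueue id="4"><next>5</next></ipfixQueue>
  <ipfixExporter id="5"/>
</ipfixConfig>
"""


def module_edges(xml_text):
    """Return (source, target) module-name pairs derived from id/next."""
    root = ET.fromstring(xml_text.strip())
    # Map each module id to the name of the module that declares it.
    names = {m.get("id"): m.tag for m in root if m.get("id") is not None}
    edges = []
    for module in root:
        # A module may contain several "next" children.
        for nxt in module.findall("next"):
            target = (nxt.text or "").strip()
            edges.append((module.tag, names.get(target, "unknown id " + target)))
    return edges


for source, target in module_edges(SAMPLE_CONFIG):
    print(source, "->", target)
```

A “next” value that points at an id no module declares shows up as "unknown id ...", which is a quick way to spot a typo in the chain.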

Configuring an IPFIX generator:
We’ll be working off a file ipfix-export.xml and I will describe only the non-obvious portions.

A lot of these are defined in code: ./src/modules/ipfix/*.h

  • The root node is the name of the config.
  • “sensorManager”: use the “checkInterval” setting to control how frequently (in seconds) sensors are polled.
  • “observer” (Input type: none, Output type: Packet): takes input from pcap interface.
  • “packetQueue” (Input type: Packet, Output type: Packet): holds packets in queue, up to child node “maxSize”, until the “next” module is ready. If full, pauses previous module. The larger the maxSize, the more RAM is utilized.
  • “packetAggregator” (Input type: Packet, Output type: IpfixRecord): takes incoming packets and makes IPFIX records out of them, using the provided settings:
    • “rule”: defines scope
      • “templateId”: IPFIX Template ID.
      • “flowKey”: Parent node of ieName… that includes in aggregation? **I don’t understand the wiki’s definition: “Flow key information element – flows are aggregated according to those keys.”
      • “nonFlowKey”: Parent node of ieName… that excludes from aggregation? **I don’t understand the wiki’s definition: “Non-flow key information element – those IEs are aggregated.”
      • “ieName”: IPFIX information elements
      • “match”: matches the ieName
        • protocolIdentifier: “TCP”, “UDP”, “ICMP”, or IANA number (for IPv4 RFC791, for IPv6 RFC2460).
        • (sourceIPv4Address, destinationIPv4Address, ipNextHopIPv4Address, bgpNextHopIPv4Address, sourceIPv4Prefix, destinationIPv4Prefix, mplsTopLabelIPv4Address, exporterIPv4Address, collectorIPv4Address, postNATSourceIPv4Address, postNATDestinationIPv4Address, staIPv4Address), (sourceIPv6Address, destinationIPv6Address, ipNextHopIPv6Address, bgpNextHopIPv6Address, exporterIPv6Address, mplsTopLabelIPv6Address, destinationIPv6Prefix, sourceIPv6Prefix, collectorIPv6Address, postNATSourceIPv6Address, postNATDestinationIPv6Address) (not clear if supports ipv6): for IPv4 use CIDR notation, for IPv6 see RFC5952.
        • udpSourcePort, udpDestinationPort, tcpSourcePort, tcpDestinationPort, sourceTransportPort, destinationTransportPort, collectorTransportPort, exporterTransportPort, postNAPTSourceTransportPort, postNAPTDestinationTransportPort: an unsigned16 that represents a port (for UDP see RFC768, for TCP see RFC793, for SCTP see RFC2960, for NAPT see RFC3022). Note that a port range can be defined as [start port]:[end port].
        • tcpControlBits: “URG”, “ACK”, “PSH”, “RST”, “SYN”, “FIN”. Combine in a comma-separated list.
    • “expiration”: defines scope
      • “inactiveTimeout”: timeout for inactive flows.
        • “unit”: “sec”, “msec”, “usec”.
      • “activeTimeout”: timeout for long-lasting active flows.
    • “pollInterval”: interval when flow should be passed to next module.
  • “ipfixQueue” (Input type: IpfixRecord; Output type: IpfixRecord): holds IPFIX records in queue until the next module is ready to process them.
  • “ipfixExporter” (Input type: IpfixRecord, Output type: none): module that sends IPFIX outbound to the network.
    • “observationDomainId”: IPFIX spec for observation domain as defined in RFC5153
    • “transportProtocol”: aka “exportTransportProtocol” in the IPFIX spec. Accepted values in the code: “17” and “UDP”, “132” and “SCTP”, “DTLS_OVER_UDP”, “DTLS_OVER_SCTP”, “TCP”

Configuring an IPFIX to MySQL DB receiver:
You should now understand the concept of the modular design, and by referring to the available module list and their input and output specs, should understand how to implement modules effectively.

Here is what we generally want:

packets -> NIC of Vermont probe -> take certain fields -> DB

Knowing the list of available modules, we already know how to configure Vermont to be an IPFIX to MySQL DB receiver:

observer -> packetqueue -> packetaggregator -> ipfixqueue -> ipfixdbwriter

Outputting to stdout:
With the modules and some cli-fu:

observer -> packetqueue -> packetaggregator -> ipfixprinter -> pipe -> [data processing script] -> redis API or `redis-cli -x`

The challenge here is that the output of ipfixprinter to stdout is not engineered to be input to `redis-cli -x`, so a script or program to process the data is necessary.
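
A minimal sketch of such a processing script in Python follows. It rests on several assumptions that would need checking: it treats each non-empty line on stdin as one record (the real ipfixprinter output would need proper parsing first), it assumes RPUSH onto the “entry:queue” list is what the consumer expects, and the function name `push_records` is made up for this example. The only redis-cli feature relied on is `-x`, which reads the last argument from stdin.

```python
import subprocess

QUEUE = "entry:queue"  # queue name from the goal stated earlier


def push_records(lines, run=subprocess.run):
    """Push each non-empty input line onto a Redis list via redis-cli -x."""
    pushed = 0
    for line in lines:
        record = line.strip()
        if not record:
            continue
        # -x makes redis-cli take the final argument (the value) from stdin.
        run(["redis-cli", "-x", "RPUSH", QUEUE],
            input=record.encode("utf-8"), check=True)
        pushed += 1
    return pushed
```

In a pipeline the script would call `push_records(sys.stdin)`. Spawning one redis-cli process per record is slow, so this only suits low flow rates; a Redis client library would be the natural next step.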

Tuning for performance:
As the above example shows, queuing becomes an important part of your probe config design. The queues are memory mapped, so RAM is a consideration: low RAM means smaller queues, while a saturated CPU means you need a larger queue. Some monitoring and tweaking would need to be done in this area.

Four measurements should be gathered: incoming packet flow, process RAM utilization, process CPU utilization, and outgoing packet flow. These will give you very basic information on whether you can tune your queues better. Remember, unpredictably large packet flow might prove difficult to engineer for.
