Usage

trafficdatasetmaker

The usage of the trafficdatasetmaker tool is given below:

usage: trafficdatasetmaker [-h] [-v] -i INPUTFILE -o OUTDIR
                       [-t {packets-csv,pcap}] [-f METADATAFILE]
                       [-d {packets,pdus,all}] [-p {heuristic}]
                       [-a {port_numbers}] [-c {dbscan}] [-e]

**required arguments**:

-i INPUTFILE, --inputfile INPUTFILE
specify pcap or packets-csv file to be modeled

-o OUTDIR, --outdir OUTDIR
output directory to save csv files

**optional arguments**:

-h, --help
show this help message and exit

-v, --verbose
log verbose to console

-t {packets-csv,pcap}, --inputfiletype {packets-csv,pcap}
input file type; if 'packets-csv', the CSV file must be formatted
according to the output of the pcap2csv utility

-f METADATAFILE, --metadatafile METADATAFILE
metadata json file with experiment description

-d {packets,pdus,all}, --datasets {packets,pdus,all}
type of dataset to generate

-p {heuristic}, --pdu-detection-method {heuristic}
method to use to classify packets into pdus

-a {port_numbers}, --app-classification-method {port_numbers}
method to use to classify packets into apps

-c {dbscan}, --connection-cluster-method {dbscan}
method to use to cluster connections to classes

-e, --skip-extra-calculations
do not calculate user sessions, connection clusters, connection
pools and request bursts
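
As an illustration of how the options above combine, the following minimal sketch assembles a trafficdatasetmaker command line from Python and runs it with the standard subprocess module. The helper names build_maker_command and run_maker are hypothetical and not part of the tool; only flags listed in the usage text above are used, and the sketch assumes the trafficdatasetmaker command is on the PATH.

```python
import subprocess


def build_maker_command(inputfile, outdir, inputfiletype=None,
                        metadatafile=None, datasets=None, verbose=False,
                        skip_extra=False):
    """Assemble the argument list for one trafficdatasetmaker run.

    Hypothetical helper: it only maps keyword arguments onto the
    documented command-line flags.
    """
    cmd = ["trafficdatasetmaker", "-i", inputfile, "-o", outdir]
    if inputfiletype is not None:
        cmd += ["-t", inputfiletype]   # 'packets-csv' or 'pcap'
    if metadatafile is not None:
        cmd += ["-f", metadatafile]    # metadata json file
    if datasets is not None:
        cmd += ["-d", datasets]        # 'packets', 'pdus' or 'all'
    if verbose:
        cmd.append("-v")
    if skip_extra:
        cmd.append("-e")
    return cmd


def run_maker(**kwargs):
    """Run the tool; subprocess raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_maker_command(**kwargs), check=True)
```

A call such as `run_maker(inputfile="capture.pcap", outdir="dataset", inputfiletype="pcap", datasets="all")` would then start one run; the file and directory names here are placeholders.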

The library can also be used within python with the code below:

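# The positional arguments correspond to the command-line options
# -i, -o, -t, -f, -d, -p, -a, -c and -e described above.
from trafficdatasetmaker.trafficdatasetmaker import Trafficdatasetmaker

csvmaker = Trafficdatasetmaker(inputfile, outdir,
                               inputfiletype, metadatafile,
                               datasets, pdu_dtcn_method,
                               app_class_method,
                               conn_cluster_method,
                               skip_extra_calculation)
# Generate the output csv files.
csvmaker.makecsvs()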

Note that the metadata file argument is optional.

The output of the processes described above is a set of csv files, a log file and a metadata file containing useful information about the dataset that has been created.

trafficdatasetmodifier

In addition to the trafficdatasetmaker command, this application installs a utility command called trafficdatasetmodifier, which modifies an existing trafficdatasetmaker output dataset. The expected input metadata file is the metadata file produced by trafficdatasetmaker, edited to describe the changes to be made to the dataset. The usage description is given below:

usage: trafficdatasetmodifier [-h] [-v] -i INPUTDIR -o OUTDIR
                            [-f METADATAFILE]

**required arguments**:

-i INPUTDIR, --inputdir INPUTDIR
directory containing the existing trafficdatasetmaker output dataset
to be modified

-o OUTDIR, --outdir OUTDIR
output directory to save csv files

-f METADATAFILE, --metadatafile METADATAFILE
adjusted metadata json file containing details of
changes that should be made to the dataset

**optional arguments**:

-h, --help
show this help message and exit

-v, --verbose
log verbose to console

Output CSV Headings Description

trafficdatasetmaker creates multiple CSV output files in the specified outdir directory. Depending on the value of the datasets argument, it can generate up to 4 files.

  1. all_pkts.csv

In addition to all the fields already described in the output of the pcap2csv app, trafficdatasetmaker also generates the fields below:

app_idx                 The identified application label
rel_time                The packet time delta from beginning of the
                        capture
user_sess_idx           The identified user session index to which
                        the packet belongs
conn_class_idx          The detected connection class cluster that
                        the packets connection belongs to
conn_pool_idx           Unique identifier for packets of
                        connections found to belong to the same
                        connection pool
req_burst_idx           Unique identifier for packets found to
                        belong to the same request burst PDU.
l2_pair_id              Unique ID for packets exchanged between
                        pairs of L2 endpoints
l3_pair_id              Unique ID for packets exchanged between
                        pairs of L3 endpoints
l4_pair_id              Unique ID for packets exchanged between
                        pairs of L4 endpoints
srv_endpoint            The combination of the server endpoint IP
                        address and port number for the packet
client_ip               The IP address of the client side for the
                        packet
l4_sess_pdu_idx         The ID of the PDU to which a packet is
                        detected to belong, unique within a TCP
                        or UDP session
global_pdu_idx          The ID of the PDU to which a packet is
                        detected to belong, unique across the
                        entire packet capture

  2. all_pdus.csv

all_pdus.csv contains the same columns as all_pkts.csv. However, each row describes a protocol data unit (PDU) detected in the traffic traces rather than a single packet.
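
To make the relation between packet rows and PDUs concrete, the sketch below groups the rows of an all_pkts.csv file by their global_pdu_idx value using the standard csv module. It assumes the file is comma-separated with a header row that uses the field names listed above and that rel_time is numeric; the sample rows are made up for illustration, and the function name packets_per_pdu is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows; a real all_pkts.csv has many more columns.
SAMPLE_ALL_PKTS = """global_pdu_idx,rel_time,l4_sess_pdu_idx
0,0.000,0
0,0.002,0
1,0.010,1
2,0.500,0
"""


def packets_per_pdu(csv_file):
    """Return {global_pdu_idx: [rel_time of each packet in that PDU]}."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row["global_pdu_idx"]].append(float(row["rel_time"]))
    return dict(groups)


pdus = packets_per_pdu(io.StringIO(SAMPLE_ALL_PKTS))
for pdu_idx, times in pdus.items():
    # Print the PDU index, its packet count and its first packet time.
    print(pdu_idx, len(times), min(times))
```

For a real dataset, replace the io.StringIO object with an open file handle, for example `open("all_pkts.csv", newline="")` inside the chosen output directory.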

  3. pdu_compare.csv

This contains a condensed form of all_pdus.csv with only the following columns:

global_pdu_idx, rel_time, l4_type, app_idx, l4_sess_idx,
l4_sess_pdu_idx, global_pdu_idx, l3_src_ip, l4_src_port,
l3_dst_ip, l4_dst_port, d_size, l4_from_client, user_sess_idx,
conn_class_idx, conn_pool_idx, req_burst_idx, protocol_data,
l3_pair_id, l4_pair_id
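
The condensed file can be thought of as a column projection of all_pdus.csv. The sketch below shows such a projection with csv.DictReader and csv.DictWriter; it is not the tool's own implementation. The function name condense is hypothetical, the column list is copied from the list above with the repeated global_pdu_idx kept once, and the sample row is made up.

```python
import csv
import io

# Columns copied from the list above (global_pdu_idx kept only once).
PDU_COMPARE_COLUMNS = [
    "global_pdu_idx", "rel_time", "l4_type", "app_idx", "l4_sess_idx",
    "l4_sess_pdu_idx", "l3_src_ip", "l4_src_port", "l3_dst_ip",
    "l4_dst_port", "d_size", "l4_from_client", "user_sess_idx",
    "conn_class_idx", "conn_pool_idx", "req_burst_idx", "protocol_data",
    "l3_pair_id", "l4_pair_id",
]


def condense(all_pdus_file, out_file, columns=PDU_COMPARE_COLUMNS):
    """Copy only the listed columns from a PDU CSV into out_file."""
    reader = csv.DictReader(all_pdus_file)
    # extrasaction="ignore" drops every column that is not in `columns`;
    # columns missing from the input are written as empty strings.
    writer = csv.DictWriter(out_file, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Made-up one-row input, projected onto three columns for brevity.
sample = io.StringIO("global_pdu_idx,rel_time,l2_pair_id,d_size\n7,1.5,3,120\n")
out = io.StringIO()
condense(sample, out, columns=["global_pdu_idx", "rel_time", "d_size"])
print(out.getvalue())
```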

  4. all_conns.csv

Contains an entry for each TCP/UDP session seen in the traffic traces. The columns are:

l4_sess_idx             Unique index for packets found to belong to
                        the same TCP or UDP connection.
l4_type                 Layer4 type of the packet 'tcp' or 'udp'
                        only
app_idx                 The identified application label
cli_ip                  The IP address of the client in the
                        connection
srv_ip                  The IP address of the server in the
                        connection
cli_port                The port number of the client in the
                        connection
srv_port                The port number of the server in the
                        connection
rel_start_time_cap
duration                duration of the connection, between first
                        and last packets
num_pkts                Total number of packets in the connection
cli_num_pkts            Total number of packets from the client
                        side in the connection
srv_num_pkts            Total number of packets from the server
                        side in the connection
num_pdus                Total number of protocol data units(PDUs)
                        in the connection
cli_num_pdus            Total number of protocol data units(PDUs)
                        from the client side in the connection
srv_num_pdus            Total number of protocol data units(PDUs)
                        from the server side in the connection
total_bytes             Total number of bytes exchanged within the
                        connection
cli_total_bytes         Total number of bytes from the client side
                        of the connection
srv_total_bytes         Total number of bytes from the server side
                        of the connection
user_sess_idx           The identified user session index to which
                        the connection belongs
conn_class_idx          The detected connection class cluster that
                        the connection belongs to
conn_pool_idx           Unique identifier for connections found to
                        belong to the same connection pool
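
As an example of working with the per-connection columns, the following sketch sums total_bytes per application label from an all_conns.csv file. It assumes a comma-separated file with a header row using the column names above and integer byte counts; the sample rows, including the application labels, are made up, and the function name bytes_per_app is hypothetical.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; a real all_conns.csv has more columns.
SAMPLE_ALL_CONNS = """l4_sess_idx,l4_type,app_idx,total_bytes,cli_total_bytes,srv_total_bytes
0,tcp,http,1500,300,1200
1,tcp,http,800,200,600
2,udp,dns,150,60,90
"""


def bytes_per_app(csv_file):
    """Return {app_idx: sum of total_bytes over its connections}."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        totals[row["app_idx"]] += int(row["total_bytes"])
    return dict(totals)


totals = bytes_per_app(io.StringIO(SAMPLE_ALL_CONNS))
print(totals)
```

The same pattern applies to the other numeric columns, such as num_pkts or num_pdus, by changing the column name that is summed.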