Lognormalizer
=============

Lognormalizer is a sample tool which is often used to test and debug
rulebases before real use. Nevertheless, it can be used in production as
a simple command line interface to liblognorm.

This tool reads log lines from its standard input and prints results
to standard output. You need to use redirections if you want to read
or write files.

An example of the command::

    $ lognormalizer -r messages.sampdb -e json

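Because the tool only reads standard input and writes standard output,
working with files comes down to shell redirection. The following is a
minimal sketch of that pattern; ``normalize_file`` is a hypothetical helper
name chosen for illustration and is not part of liblognorm.

```shell
#!/bin/sh
# Sketch: normalize one log file into another using shell redirection.
# normalize_file is a hypothetical helper, not something liblognorm ships.
normalize_file() {
    rulebase=$1   # rulebase file, passed to -r
    input=$2      # raw log lines, fed to the tool on standard input
    output=$3     # normalized events, captured from standard output
    lognormalizer -r "$rulebase" -e json < "$input" > "$output"
}
```

It could be called as ``normalize_file messages.sampdb input.log output.json``,
where both file names are placeholders.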
Command line options
--------------------

::

    -V

Output version information, including information about the installed
version of liblognorm and its optional features. This can also be
used to check the currently installed library version.

::

    -r <FILENAME>

Specifies the name of the file containing the rulebase.

::

    -v

Increase verbosity level. Can be used several times. If used three
times, internal data structures are dumped (which is useful to
developers only).

::

    -p

Print only successfully parsed messages.

::

    -P

Print only messages **not** successfully parsed.

::

    -L

Add line number information to events not successfully parsed. This
is meant as a troubleshooting aid when working with unparsable events,
as the information can be used to go directly to the line in question
in the source data file. The line number is contained in a field
named ``lognormalizer.line_nbr``.

::

    -t <TAG>

Print only those messages which have this tag.

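As a small illustration of tag filtering, the sketch below combines ``-t``
with JSON output and the ``-T`` option so that the tags are visible in each
event. The function name ``events_with_tag`` is made up for this example.

```shell
#!/bin/sh
# Sketch: print only events that carry a given tag, as JSON with tags shown.
# events_with_tag is a hypothetical helper name used for illustration only.
events_with_tag() {
    rulebase=$1   # rulebase file, passed to -r
    tag=$2        # tag to select, passed to -t
    # Log lines are read from standard input, results go to standard output.
    lognormalizer -r "$rulebase" -t "$tag" -e json -T
}
```

A call might look like ``events_with_tag rulebases/sample.rulebase tag2 < input.log``,
with ``input.log`` standing in for your own data.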
::

    -T

Include the ``event.tags`` attribute when output is in JSON format. This
attribute contains the list of tags of the matched rule.

::

    -E <DATA>

Encoder-specific data. For CSV, it is the list of fields to be output,
separated by commas or spaces. It is currently unused for other formats.

::

    -d <FILENAME>

Generate a DOT file describing the parse tree. It is used to plot the
parse graph with GraphViz.

::

    -H

At the end of the run, print a summary line with the number of messages
processed, parsed and unparsed to stdout.

::

    -U

At the end of the run, print a summary line with the number of unparsed
messages to stdout. Note that this message is only printed if there was
at least one unparsable message.

::

    -o

Special options. The following ones can be set:

* **allowRegex** Permits the use of regular expressions inside the v1
  engine. This is deprecated and should not be used for new deployments.

* **addExecPath** Includes metadata in the event on how parsing was
  attempted. Can be useful in troubleshooting normalization problems.

* **addOriginalMsg** Always add the "original-msg" data item. By
  default, this is only done when a message could not be parsed.

* **addRule** Add a mockup of the rule that was processed. Note that
  it is *not* an exact copy of the rule, but a rule that correctly
  describes the parsed message. Most importantly, prefixes are
  appended and custom data types are expanded (and no longer visible
  as such). This option is primarily meant for postprocessing, e.g.
  as input to an anonymizer.

* **addRuleLocation** For rules that successfully parsed, add the
  location of the rule inside the rulebase. Both the file name and
  the line number are given. If two rules evaluate to the same
  end node, only a single rule location is given. However, in
  practice this is extremely unlikely, so for practical purposes
  the information can be considered reliable.

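The special options are mainly a troubleshooting aid. The sketch below turns
on two of them for a debugging run. It assumes that ``-o`` may be given more
than once, once per option name; that is an assumption about the command line
parser, not something stated above. ``debug_normalize`` is a hypothetical
helper name.

```shell
#!/bin/sh
# Sketch: run the normalizer with extra troubleshooting metadata.
# Assumption: -o can be repeated, one special option per occurrence.
# debug_normalize is a hypothetical helper name, not part of liblognorm.
debug_normalize() {
    rulebase=$1   # rulebase file, passed to -r
    # addExecPath: record how parsing was attempted
    # addOriginalMsg: always keep the original message in the event
    lognormalizer -r "$rulebase" -e json -o addExecPath -o addOriginalMsg
}
```

Usage could be ``debug_normalize messages.rb < input.log``, where both file
names are placeholders.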
::

    -s <FILENAME>

At the end of the run, print internal parse DAG statistics and exit. This
option is meant for developers and researchers who want to get insight
into the quality of the algorithm and/or how efficiently the rulebase
could be processed. **NOT** intended for end users. This option is
performance-intensive.

::

    -S <FILENAME>

Even more detailed statistics than ``-s``. Requires that the version is
compiled with ``--enable-advanced-statistics``, which causes a
considerable performance loss.

::

    -x <FILENAME>

Print statistics as a DOT file. In order to keep the graph readable,
information is only emitted for called nodes.

::

    -e <json|xml|csv|raw|cee-syslog>

Output format. By default, output is in JSON format. With this option,
you can change it to a different one.

Supported Output Formats
........................

The JSON, XML, and CSV formats should be self-explanatory.

The cee-syslog format emits messages according to the Mitre CEE spec.
Note that the cee-syslog format is primarily supported for
backward-compatibility. It does **not** support nested data items
and as such cannot be used when the rulebase makes use of this
feature (we assume this is the common case nowadays). We strongly
recommend not using it for new deployments. Support may be removed
in later releases.

The raw format outputs an exact copy of the input message, without
any normalization visible. The prime use case of "raw" is to extract
either all messages that could be normalized or all that could not.
To do so, specify the ``-p`` or ``-P`` option. It also works in
combination with the ``-t`` option to extract a subset based on tagging.
In any case, the core use is to prepare a subset of the original file
for further processing.

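The raw use case described above can be sketched as a two-pass split of one
input file. This is only an illustration; ``split_by_parse_result`` and the
output file arguments are hypothetical names, not part of the tool.

```shell
#!/bin/sh
# Sketch: split a log file into normalizable and non-normalizable lines.
# split_by_parse_result is a hypothetical helper name used for illustration.
split_by_parse_result() {
    rulebase=$1   # rulebase file, passed to -r
    input=$2      # original log file
    parsed=$3     # receives verbatim copies of lines that could be parsed
    unparsed=$4   # receives verbatim copies of lines that could not
    # Pass 1: raw copies of successfully parsed messages (-p).
    lognormalizer -r "$rulebase" -e raw -p < "$input" > "$parsed"
    # Pass 2: raw copies of messages that failed to parse (-P).
    lognormalizer -r "$rulebase" -e raw -P < "$input" > "$unparsed"
}
```

The file written to ``unparsed`` is then a natural starting point for writing
additional rules.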
Examples
--------

These examples were created using the sample rulebase from the source
package.

CEE output (``-e cee-syslog``)::

    $ lognormalizer -r rulebases/sample.rulebase -e cee-syslog
    Weight: 42kg
    [cee@115 event.tags="tag2" unit="kg" N="42" fat="free"]
    Snow White and the Seven Dwarfs
    [cee@115 event.tags="tale" company="the Seven Dwarfs"]
    2012-10-11 src=127.0.0.1 dst=88.111.222.19
    [cee@115 dst="88.111.222.19" src="127.0.0.1" date="2012-10-11"]

JSON output, flat tags enabled::

    $ lognormalizer -r rulebases/sample.rulebase -e json -T
    %%
    { "event.tags": [ "tag3", "percent" ], "percent": "100", "part": "wha", "whole": "whale" }
    Weight: 42kg
    { "unit": "kg", "N": "42", "event.tags": [ "tag2" ], "fat": "free" }

CSV output with fixed field list::

    $ lognormalizer -r rulebases/sample.rulebase -e csv -E'N unit'
    Weight: 42kg
    "42","kg"
    Weight: 115lbs
    "115","lbs"
    Anything not matching the rule
    ,

Creating a graph of the rulebase
--------------------------------

To get a better overview of a rulebase you can create a graph that shows you
the chain of normalization (parse-tree).

First, you have to install an additional package called graphviz. Graphviz
is a tool that creates such a graph with the help of a control file (created
with the rulebase). More information is available on the
`graphviz website <http://www.graphviz.org/>`_.

To install it you can use the package manager. For example, on Red Hat
systems use the yum command::

    $ sudo yum install graphviz

The next step is to create the control file for graphviz. For this we
use the lognormalizer command with the options -d "preferred filename for
the control file" and -r "rulebase"::

    $ lognormalizer -d control.dot -r messages.rb

Please note that there is no need for an input or output file.
If you have a look at the control file now, you will see that the content is
a little bit confusing, but it includes all the information, like the nodes,
fields and parsers, that graphviz needs to create the graph. Of course you
can edit that file, but please note that it is a lot of work.

Now we can create the graph by typing::

    $ dot control.dot -Tpng >graph.png

The command consists of ``dot``, the name of the control file, the option
``-T`` followed by the desired file format, and a redirection to the
output file.

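Both steps can be chained in a small helper. The sketch below is one possible
way to do that; ``rulebase_graph`` is a hypothetical name, and redirecting
standard input from ``/dev/null`` is only a precaution added here in case the
tool waits for log lines.

```shell
#!/bin/sh
# Sketch: generate the DOT control file and render it in one go.
# rulebase_graph is a hypothetical helper name used for illustration.
rulebase_graph() {
    rulebase=$1               # rulebase file, passed to -r
    image=$2                  # target image, e.g. graph.png
    format=${image##*.}       # text after the last dot, handed to dot -T
    dotfile=${image%.*}.dot   # control file written next to the image
    # Step 1: let lognormalizer write the control file.
    lognormalizer -d "$dotfile" -r "$rulebase" < /dev/null || return 1
    # Step 2: let graphviz render the control file in the chosen format.
    dot "$dotfile" -T"$format" > "$image"
}
```

For instance, ``rulebase_graph messages.rb graph.png`` would mirror the two
commands shown above, leaving ``graph.dot`` and ``graph.png`` in the current
directory.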
That is just one example of using graphviz; of course you can do many
other great things with it. Still, this "simple" graph can be very
helpful when working with the normalizer.

Below you see a sample of such a graph, but please note that this is
not a very pretty one. Such a graph can grow very quickly as you extend
your rulebase.

.. figure:: graph.png
   :width: 90 %
   :alt: graph sample