How to configure
================

To use liblognorm, you need three things:

1. An installed and working copy of liblognorm. The installation process
   is described in the chapter :doc:`installation`.
2. Log files.
3. A rulebase, which is the heart of the liblognorm configuration.

Log files
---------

A log file is a text file that typically holds many lines, each of which
is a log message. These messages are often hard to read and therefore hard
to analyze, especially when many different devices each write log messages
in their own format.

Rulebase
--------

The rulebase holds the schemes for all of your logs. It consists of
lines that reflect the structure of your log messages. When
normalization starts, a parse tree is generated from the rulebase and
kept in memory. This tree is then used to parse the log messages.

Each line in a rulebase file is evaluated separately.

Rulebase Versions
-----------------
This documentation is for liblognorm version 2 and above. Version 2 is a
complete rewrite of liblognorm which offers many enhanced features but
is incompatible with some pre-v2 rulebase commands. For details, see the
compatibility document.

Note that liblognorm v2 contains a full copy of the v1 engine. As such,
it is fully compatible with old rulebases. In order to use the new v2
engine, you need to explicitly opt in. To do so, add the line::

    version=2

to the top of your rulebase file. Currently, it is very important that

 * the line is given exactly as above
 * no whitespace is used within the sequence (e.g. "version = 2"
   is invalid)
 * no whitespace or comment follows the "2"
   (e.g. "version=2 # comment" is invalid)
 * this line is the **very** first line of the file; this
   also means there **must** not be any comment or empty lines in
   front of it

The v2 engine is used only if the version indicator is properly detected.
Otherwise, the v1 engine is used. So if you use v2 features but
get the version line wrong, you will end up with error messages from the
v1 engine.

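As a minimal sketch of a correctly marked v2 rulebase, the file below starts
with the version line and only then continues with a comment and a rule. The
rule itself is purely illustrative; the literal text and the field names
``ip`` and ``user`` are assumptions, not part of any shipped rulebase::

    version=2
    # comments and empty lines are fine from the second line on
    rule=:login from %ip:ipv4% by %user:word%
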
The v2 engine understands almost all v1 parsers, most importantly all
those that are typically used. It does not understand these parsers:

 * tokenized
 * recursive
 * descent
 * regex
 * interpret
 * suffixed
 * named_suffixed

The recursive and descent parsers should be replaced by user-defined
types. The tokenized parser should be replaced by repeat. The interpret
functionality is provided via the parser's "format" parameters. For the
others, no replacement currently exists, but with the exception of regex,
replacements will be added based on demand. If you think regex support is
urgently needed, please read our
`related issue on github <https://github.com/rsyslog/liblognorm/issues/143>`_,
where you can also cast
your ballot in favor of it. If you need any of these parsers, you need
to use the v1 engine. That of course means you cannot use the v2 enhancements,
so converting as much as possible makes sense.

Commentaries
------------

To keep your rulebase tidy, you can use comments. Start a comment
with "#", as in many other configuration formats. It should look like this::

    # The following prefix and rules are for firewall logs

Note that the comment character MUST be in the first column of the line.

Empty lines are simply skipped; they can be inserted for readability.

User-Defined Types
------------------

If the line starts with ``type=``, then it contains a user-defined type.
You can use a user-defined type wherever you use a built-in type; they
are equivalent. That also means you can use user-defined types in the
definition of other user-defined types (they can be used recursively).
The only restriction is that you must define a type **before** you can
use it.

This line has the following format::

    type=<typename>:<match description>

Everything before the colon is treated as the type name. User-defined types
must always start with "@". So "@mytype" is a valid name, whereas "mytype"
is invalid and will lead to an error.

After the colon, a match description should be
given. It is exactly the same as the one given in rule lines (see below).

A generic IP address type could look as follows::

    type=@IPaddr:%ip:ipv4%
    type=@IPaddr:%ip:ipv6%

This creates a type "@IPaddr", which consists of either an IPv4 or IPv6
address. Note how we use two different lines to create an alternative
representation. This is how things generally work with types: you can use
as many "type" lines for a single type as you need to define your object.
Note that pure alternatives could also be defined via the "alternative"
parser; which option to choose is left to the user. They are equivalent.
The ability to use multiple type lines for a definition, however, brings
more power than just defining alternatives.

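To illustrate how such a type is referenced, the following sketch uses
"@IPaddr" in place of a built-in type inside a rule. The literal text and
the field names ``src`` and ``dst`` are made up for this example::

    type=@IPaddr:%ip:ipv4%
    type=@IPaddr:%ip:ipv6%
    rule=:connection from %src:@IPaddr% to %dst:@IPaddr%

Because the type definitions name their field "ip", each match is expected
to appear as a JSON subtree below ``src`` and ``dst``. The special field
name ".." described in the condensed format section avoids that extra level.
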
Includes
--------
Includes come in handy, especially with user-defined types. With an include,
you can include definitions already made elsewhere into the current
rule set (just like the "include" directive works in many programming
languages). An include is done by a line starting with ``include=``
where the rest of the line is the actual file name, just like in this
example::

   include=/var/lib/liblognorm/stdtypes.rb

The definition is included right at the position where it occurs.
Processing of the original file continues once the included file
has been fully processed. Includes can be nested.

To facilitate repositories of common rules, liblognorm honors the

::

   LIBLOGNORM_RULEBASES

environment variable. If it is set, liblognorm tries to locate the file
inside the path pointed to by ``LIBLOGNORM_RULEBASES`` if both of the
following conditions hold:

* the provided file cannot be found
* the provided file name is not an absolute path (does not start with "/")

So assuming we have::

   export LIBLOGNORM_RULEBASES=/var/lib/liblognorm

The above example can be rewritten as follows::

   include=stdtypes.rb

Note, however, that if ``stdtypes.rb`` exists in the current working
directory, that file will be loaded instead of the one from
``/var/lib/liblognorm``.

This facilitates building a library of standard type definitions. Note
that the liblognorm project also ships type definitions for common
scenarios.

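As a sketch of how includes and user-defined types combine, assume that
``stdtypes.rb`` can be located as described above and that it defines the
"@IPaddr" type from the previous section. Both points are assumptions of
this example and say nothing about the file shipped with liblognorm. A main
rulebase could then look like this::

    version=2
    include=stdtypes.rb
    rule=:blocked %src:@IPaddr%

The include line comes before the rule because a type must be defined
before it is used.
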
Rules
-----

If the line starts with ``rule=``, then it contains a rule. This line has
the following format::

    rule=[<tag1>[,<tag2>...]]:<match description>

Everything before the colon is treated as a comma-separated list of tags, which
will be attached to a match. After the colon, a match description should be
given. It consists of string literals and field selectors. String literals
must match exactly, whereas field selectors may match variable parts
of a message.

A rule could look like this (in legacy format)::

    rule=:%date:date-rfc3164% %host:word% %tag:char-to:\x3a%: no longer listening on %ip:ipv4%#%port:number%'

This excerpt is a common rule. A rule always contains several different
"parts"/properties and reflects the structure of the message you want to
normalize (e.g. Host, IP, Source, Syslogtag...).

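The rule above uses an empty tag list. As a sketch, the same rule with two
tags attached looks like this; the tag names ``named`` and ``shutdown`` are
invented for illustration::

    rule=named,shutdown:%date:date-rfc3164% %host:word% %tag:char-to:\x3a%: no longer listening on %ip:ipv4%#%port:number%'
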
Literals
--------

A literal is just a sequence of characters, which must match exactly.
Percent sign characters must be escaped to prevent them from starting a
field accidentally. Replace each "%" with "\\x25" or "%%" when it occurs
in a string literal.

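For instance, a message such as ``disk usage 91% on sda1`` contains a
literal percent sign. A sketch of a matching rule, with the invented field
names ``pct`` and ``dev``, could escape it like this::

    rule=:disk usage %pct:number%\x25 on %dev:word%
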
Fields
------

There are different formats for field specification:

 * legacy format
 * condensed format
 * full json format

Legacy Format
#############
Legacy format is identical to the format used by the v1 engine. This permits
you to use existing v1 rulebases without any modification with the v2 engine,
except for adding the ``version=2`` header line to the top of the file.
Remember: some v1 types are not supported - if you are among the few who use
them, you need to do some manual conversion. For almost all users, manual
conversion should not be necessary.

Legacy format is not documented here. If you want to use it, see the v1
documentation.

Condensed Format
################
The goal of this format is to be as brief as possible, giving you an
as-clear-as-possible view of your rule. It is very similar to legacy format
and recommended for simple types which do not need any parser
parameters.

Its structure is as follows::

    %<field name>:<field type>[{<parameters>}]%

**field name** -> this name can be selected freely. It should describe
what kind of information the field holds, e.g. SRC for a field that
contains the source IP address of the message. These names should be
chosen carefully, since a field name can be used in every rule and
therefore should fit the same kind of information in different rules.

Some special field names exist:

* **dash** ("-"): this field is matched but not saved
* **dot** ("."): this is useful if a parser returns a set of fields. Usually,
  it does so by creating a json subtree. If the field is named ".", then
  no subtree is created but instead the subfields are moved into the main
  hierarchy.
* **two dots** (".."): similar to ".", but can be used at the lower level to denote
  that a field is to be included with the name given by the upper-level
  object. Note that ".." is only acted on if a subelement contains a single
  field. The reason is that if there were more, we could not assign all of
  them to the *single* name given by the upper-level object. The prime
  use case for this special name is in user-defined types that parse only
  a single value. Without "..", they would always become a JSON subtree, which
  seems unnatural and is different from built-in types. So it is suggested to
  name such fields "..", which means that the user can assign a name of their
  liking, just like in the case of built-in parsers.

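A brief sketch of the special names in use; the type name "@port" and the
field name ``dport`` are assumptions chosen for illustration::

    type=@port:%..:number%
    rule=:%-:word% connected on port %dport:@port%

Here the first word of the message is matched but discarded, and the
number parsed by "@port" is expected to be stored directly under ``dport``
instead of in a subtree.
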
**field type** -> selects the corresponding parser; the parsers are described below.

Special characters that need to be escaped when used inside a field
description are "%" and ":". It is strongly recommended **not** to use them.

**parameters** -> This is an optional set of parameters, given in pure JSON
format. Parameters can be generic (e.g. "priority") or specific to a
parser (e.g. "extradata"). Generic parameters are described below in their
own section, parser-specific ones in the relevant type documentation.

As an example, the "char-to" parser accepts a parameter named "extradata"
which describes up to which character it shall match (the name "extradata"
stems back to the legacy v1 system)::

    %tag:char-to{"extradata":":"}%

Whitespace, including LF, is permitted inside a field definition after
the opening percent sign and before the closing one. This can be used to
make complex rules more readable. So the example rule from the overview
section above could be rewritten as::

    rule=:%
          date:date-rfc3164
          % %
          host:word
          % %
          tag:char-to{"extradata":":"}
          %: no longer listening on %
          ip:ipv4
          %#%
          port:number
          %'

When doing this, note well that whitespace IS important inside the
literal text. So e.g. in the "% %" lines of the example above we require
a single SP as literal text. Note that any combination of your liking is
valid, so it could also be written as::

    rule=:%date:date-rfc3164% %host:word% % tag:char-to{"extradata":":"}
          %: no longer listening on %  ip:ipv4  %#%  port:number  %'

To prevent a typical user error, continuation lines are **not** permitted
to start with ``rule=``. There are some obscure cases where this could
be a valid rule, and it can be re-formatted in that case. More often, this
is the result of a missing percent sign, as in this sample::

     rule=:test%field:word ... missing percent sign ...
     rule=:%f:word%

If we permitted ``rule=`` at the start of a continuation line, these kinds
of problems would be very hard to detect.

Full JSON Format
################
This format is best for complex definitions or if there are many parser
parameters.

Its structure is as follows::

    %JSON%

Where JSON is the configuration expressed in JSON. To get you started, let's
rewrite the above sample in pure JSON form::

    rule=:%[ {"type":"date-rfc3164", "name":"date"},
             {"type":"literal", "text":" "},
             {"type":"char-to", "name":"host", "extradata":":"},
             {"type":"literal", "text":": no longer listening on "},
             {"type":"ipv4", "name":"ip"},
             {"type":"literal", "text":"#"},
             {"type":"number", "name":"port"}
            ]%

A couple of things to note:

 * we express everything in this example in a *single* parser definition
 * this is done by using a **JSON array**; whenever an array is used,
   multiple parsers can be specified. They are executed one after the
   other in the given order.
 * literal text is matched here via explicit parser call; as specified
   below, this is recommended only for specific use cases with the
   current version of liblognorm
 * parser parameters (both generic and parser-specific ones) are given
   on the main JSON level
 * the literal text shall not be stored inside an output variable; for
   this reason no name attribute is given (we could also have used
   ``"name":"-"`` which achieves the same effect but is more verbose).

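Leaving out the name works for any parser, not only for literals. In the
following sketch the first word is matched but not stored because its
definition has no "name"; the literal text and the field name ``user`` are
illustrative assumptions::

    rule=:%{"type":"word"}% login by %{"type":"word", "name":"user"}%
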
With the literal parser calls replaced by actual literals, the sample
looks like this::

    rule=:%{"type":"date-rfc3164", "name":"date"}
          % %
           {"type":"char-to", "name":"host", "extradata":":"}
          %: no longer listening on %
            {"type":"ipv4", "name":"ip"}
          %#%
            {"type":"number", "name":"port"}
          %

Which format you use and how exactly you use it is up to you.

Some guidelines:

 * using the "literal" parser in JSON should be avoided currently; the
   experimental version does have some rough edges where conflicts
   in literal processing will not be properly handled. This should not
   be an issue in "closed environments", like "repeat", where no such
   conflict can occur.
 * otherwise, JSON is perfect for very complex things (like nesting of
   parsers); it is **not** suggested to use any other format for these
   kinds of things.
 * if a field needs to be matched but the result of that match is not
   needed, omit the "name" attribute; specifically avoid using
   the more verbose ``"name":"-"``.
 * it is a good idea to start each definition with ``"type":"..."``
   as this provides a good quick overview of what is being defined.

Mandatory Parameters
....................

type
~~~~
The field type; it selects the parser to use. See "Field types" below for a
description.

Optional Generic Parameters
...........................

name
~~~~
The field name to use. If "-" is used, the field is matched, but not stored.
In this case, you can simply **not** specify a field name, which is the
preferred way of doing this.

priority
~~~~~~~~
The priority to assign to this parser. Priorities are numerical values in the
range from 0 (highest) to 65535 (lowest). If multiple parsers could match at
a given character position of a log line, parsers are tried in priority order.
Different priorities can lead to different parsing. For example, if the
greedy "rest" type is assigned priority 0, and no other parser is assigned the
same priority, no other parser will ever match (because "rest" is very greedy
and always matches the rest of the message).

Note that liblognorm internally
has a parser-specific priority, which is selected by the program developer based
on the specificity of a type. If the user assigns equal priorities, parsers are
executed based on the parser-specific priority.

The default priority value is 30,000.

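As a sketch of the priority parameter, the two rules below compete at the
same position after the literal ``client``. Giving "rest" the lowest
priority lets the more specific "ipv4" parser be tried first. The field
names are invented, and passing the priority as a JSON number is an
assumption of this example::

    rule=:client %addr:ipv4%
    rule=:client %text:rest{"priority":65535}%
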
Field types
-----------
We have legacy and regular field types. Pre-v2, we did not have user-defined types.
As such, there was a relatively large number of parsers that handled very similar
cases, for example for strings. These parsers still work and may even provide
the best performance in extreme cases. In v2, we focus on fewer, but more
generic parsers, which are then tailored via parameters.

There is nothing bad about using legacy parsers and there is no
plan to phase them out at any time in the future. We just wanted to
let you know, especially if you wonder about some "weird" parsers.
In v1, parsers could have only a single parameter, which was called
"extradata" at that time. This is why some of the legacy parsers
require or support a parameter named "extradata" and do not use a
better name for it (internally, the legacy format creates a
v2 parser definition with "extradata" being populated from the
legacy "extradata" part of the configuration).

number
######

One or more decimal digits.

Parameters
..........

format
~~~~~~

Specifies the format of the json object. Possible values are "string" and
"number", with string being the default. If "number" is used, the json
object will be a native json integer.

maxval
~~~~~~

Maximum value permitted for this number. If the value is higher than this,
it will not be detected by this parser definition and an alternate detection
path will be pursued.

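A sketch combining both parameters for a TCP or UDP port; the literal text,
the field name ``port`` and the numeric form of "maxval" are assumptions of
this example::

    rule=:listening on port %port:number{"format":"number", "maxval":65535}%

With this definition, ``listening on port 514`` is expected to yield the
JSON integer ``514`` rather than the string ``"514"``, while a value such as
``70000`` would not be matched by this rule.
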
float
#####

A floating-point number represented in non-scientific form.

Parameters
..........

format
~~~~~~

Specifies the format of the json object. Possible values are "string" and
"number", with string being the default. If "number" is used, the json
object will be a native json floating point number. Note that we try to
preserve the original string serialization format, but keep in mind
that floating point numbers are inherently imprecise, so slight variance
may occur depending on how they are processed.

hexnumber
#########

A hexadecimal number as seen by this parser begins with the string
"0x", is followed by 1 or more hex digits and is terminated by white
space. Any interleaved non-hex digits will cause non-detection. The
rules are strict to avoid false positives.

Parameters
..........

format
~~~~~~

Specifies the format of the json object. Possible values are "string" and
"number", with string being the default. If "number" is used, the json
object will be a native json integer. Note that json numbers are always
decimal, so if "number" is selected, the hex number will be converted
to decimal. The original hex string is no longer available in this case.

maxval
~~~~~~

Maximum value permitted for this number. If the value is higher than this,
it will not be detected by this parser definition and an alternate detection
path will be pursued. This is most useful if fixed-size hex numbers need to
be processed. For example, for byte values the "maxval" could be set to 255,
which ensures that invalid values are not misdetected.

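The byte-value case can be sketched as a user-defined type. The type name
"@hexbyte", the field name ``flags`` and the numeric form of "maxval" are
assumptions made for illustration::

    type=@hexbyte:%..:hexnumber{"format":"number", "maxval":255}%
    rule=:flags %flags:@hexbyte% set

For the message ``flags 0x1f set`` this is expected to store the decimal
value ``31``, whereas ``flags 0x1ff set`` exceeds "maxval" and is not
matched by this rule.
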
kernel-timestamp
################

Parses a Linux kernel timestamp, which has the format::

    [ddddd.dddddd]

where "d" is a decimal digit. The part before the period has to
have at least 5 digits as per kernel code. There is no upper
limit per se inside the kernel, but liblognorm does not accept
more than 12 digits, which seems more than sufficient (we may reduce
the max count if misdetections occur). The part after the period
has to have exactly 6 digits.

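As a sketch, a rule for kernel messages might start like this; the literal
text and the field names ``ts`` and ``subsys`` are invented for
illustration::

    rule=:%ts:kernel-timestamp% %subsys:char-to{"extradata":":"}%: link up

This is meant for a line such as ``[12345.123456] eth0: link up``.
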
whitespace
##########

This parses all whitespace until the first non-whitespace character
is found. This check is performed using the ``isspace()`` C library
function to check for space, horizontal tab, newline, vertical tab,
form feed and carriage return characters.

This parser is primarily a tool to skip to the next "word" if
the exact number of whitespace characters (and type of whitespace)
is not known. The current parsing position MUST be on a whitespace
character, else the parser does not match.

Remember that to just parse but not preserve the field contents, the
dash ("-") is used as field name in condensed format or the "name"
parameter is simply omitted in JSON format. This is almost always
expected with the *whitespace* type.

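A sketch of skipping a run of spaces of unknown length between two words;
the field names ``key`` and ``value`` are assumptions::

    rule=:%key:word%%-:whitespace%%value:word%

This is intended to match ``state   up`` as well as ``state up``, storing
only ``key`` and ``value``.
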
string
######

This is a highly customizable parser that can be used to extract
many types of strings. It is meant to be used for most cases. It
is suggested that specific string types are created as user-defined
types using this parser.

This parser supports:

* various quoting modes for strings
* escape character processing

Parameters
..........

quoting.mode
~~~~~~~~~~~~
Specifies how the string is quoted. Possible modes:

* **none** - no quoting is permitted
* **required** - quotes must be present
* **auto** - quotes are permitted, but not required

Default is ``auto``.

quoting.escape.mode
~~~~~~~~~~~~~~~~~~~

Specifies how quote character escaping is handled. Possible modes:

* **none** - there are no escapes, quote characters are *not* permitted in value
* **double** - the ending quote character is duplicated to indicate
  a single quote without termination of the value (e.g. ``""``)
* **backslash** - a backslash is prepended to the quote character (e.g. ``\"``)
* **both** - both double and backslash escaping can happen and are supported

Note that turning on ``backslash`` mode (or ``both``) has the side-effect that
backslash escaping is enabled in general. This usually is what you want
if this option is selected (e.g. otherwise you could no longer represent
backslash).

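A sketch of a field that must be quoted and may contain backslash-escaped
quotes; the literal text and the field name ``ua`` are assumptions::

    rule=:agent=%ua:string{"quoting.mode":"required", "quoting.escape.mode":"backslash"}%

This is meant for input such as ``agent="curl \"test\" build"``.
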
Packit 1422b7
quoting.char.begin
Packit 1422b7
~~~~~~~~~~~~~~~~~~
Packit 1422b7
Packit 1422b7
Sets the begin quote character.
Packit 1422b7
Packit 1422b7
Default is ".
Packit 1422b7
Packit 1422b7
quoting.char.end
Packit 1422b7
~~~~~~~~~~~~~~~~
Packit 1422b7
Packit 1422b7
Sets the end quote character.
Packit 1422b7
Packit 1422b7
Default is ".
Packit 1422b7
Packit 1422b7
Note that setting the begin and end quote character permits you to
Packit 1422b7
support more quoting modes. For example, brackets and braces are
Packit 1422b7
used by some software for quoting. To handle such string, you can for
Packit 1422b7
example use a configuration like this::
Packit 1422b7
Packit 1422b7
   rule=:a %f:string{"quoting.char.begin":"[", "quoting.char.end":"]"}% b
Packit 1422b7
Packit 1422b7
which matches strings like this::
Packit 1422b7
Packit 1422b7
   a [test test2] b
Packit 1422b7
Packit 1422b7
matching.permitted
Packit 1422b7
~~~~~~~~~~~~~~~~~~
Packit 1422b7
Packit 1422b7
This allows to specify a set of characters permitted in the to-be-parsed
Packit 1422b7
field. It is primarily a utility to extract things like programming-language
Packit 1422b7
like names (e.g. consisting of letters, digits and a set of special characters
Packit 1422b7
only), alphanumeric or alphabetic strings.
Packit 1422b7
Packit 1422b7
If this parameter is not specified, all characters are permitted. If it
Packit 1422b7
is specified, only the configured characters are permitted.
Packit 1422b7
Packit 1422b7
Note that this option reliably only works on US-ASCII data. Multi-byte
Packit 1422b7
character encodings may lead to strange results.
Packit 1422b7
Packit 1422b7
There are two ways to specify permitted characters. The simple one is to
Packit 1422b7
specify them directly for the parameter::
Packit 1422b7
Packit 1422b7
  rule=:%f:string{"matching.permitted":"abc"}%
Packit 1422b7
Packit 1422b7
This only supports literal characters and all must be given as a single
Packit 1422b7
parameter. For more advanced use cases, an array of permitted characters
Packit 1422b7
can be provided::
Packit 1422b7
Packit 1422b7
  rule=:%f:string{"matching.permitted":[
Packit 1422b7
		       {"class":"digit"},
Packit 1422b7
		       {"chars":"xX"}
Packit 1422b7
                          ]}%
Packit 1422b7
Packit 1422b7
Here, ``class`` is a specify for the usual character classes, with
Packit 1422b7
support for:
Packit 1422b7
Packit 1422b7
* digit
Packit 1422b7
* hexdigit
Packit 1422b7
* alpha
Packit 1422b7
* alnum
Packit 1422b7
Packit 1422b7
In contrast, ``chars`` permits to specify literal characters. Both
Packit 1422b7
``class`` as well as ``chars`` may be specified multiple times inside
Packit 1422b7
the array. For example, the ``alnum`` class could also be permitted as
Packit 1422b7
follows::
Packit 1422b7
Packit 1422b7
  rule=:%f:string{"matching.permitted":[
                     {"class":"digit"},
                     {"class":"alpha"}
                 ]}%
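
The way the two forms of ``matching.permitted`` combine can be sketched in
Python. This is only an illustration of the semantics described above, not
liblognorm's implementation; the names ``CLASSES``, ``build_permitted`` and
``is_permitted`` are made up for this sketch, and only US-ASCII is considered.

```python
import string

# Character classes named in the text, mapped to US-ASCII character sets.
CLASSES = {
    "digit": set(string.digits),
    "hexdigit": set(string.hexdigits),
    "alpha": set(string.ascii_letters),
    "alnum": set(string.ascii_letters + string.digits),
}


def build_permitted(spec):
    """Return the set of permitted characters for a "matching.permitted" value.

    spec is either a plain string of literal characters or a list of
    {"class": ...} / {"chars": ...} entries, mirroring the two forms above.
    """
    if isinstance(spec, str):
        return set(spec)
    permitted = set()
    for entry in spec:
        if "class" in entry:
            permitted |= CLASSES[entry["class"]]
        if "chars" in entry:
            permitted |= set(entry["chars"])
    return permitted


def is_permitted(value, spec):
    """True if every character of value is in the permitted set."""
    allowed = build_permitted(spec)
    return all(ch in allowed for ch in value)
```

With the array from the first example, ``is_permitted("0x12", ...)`` holds,
while a value containing ``F`` is rejected because only digits, ``x`` and
``X`` are permitted.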
Packit 1422b7
Packit 1422b7
word
Packit 1422b7
####
Packit 1422b7
Packit 1422b7
One or more characters, up to the next space (\\x20), or
Packit 1422b7
up to end of line.
Packit 1422b7
Packit 1422b7
string-to
Packit 1422b7
######### 
Packit 1422b7
Packit 1422b7
One or more characters, up to the next string given in
Packit 1422b7
"extradata".
Packit 1422b7
Packit 1422b7
alpha
Packit 1422b7
#####   
Packit 1422b7
Packit 1422b7
One or more alphabetic characters, up to the next whitespace, punctuation,
decimal digit or control character.
Packit 1422b7
Packit 1422b7
char-to
Packit 1422b7
####### 
Packit 1422b7
Packit 1422b7
One or more characters, up to the next character(s) given in
Packit 1422b7
extradata.
Packit 1422b7
Packit 1422b7
Parameters
Packit 1422b7
..........
Packit 1422b7
Packit 1422b7
extradata
Packit 1422b7
~~~~~~~~~
Packit 1422b7
Packit 1422b7
This is a mandatory parameter. It contains one or more characters, each of
Packit 1422b7
which terminates the match.
Packit 1422b7
Packit 1422b7
Packit 1422b7
char-sep
Packit 1422b7
########
Packit 1422b7
Packit 1422b7
Zero or more characters, up to the next character(s) given in extradata.
Packit 1422b7
Packit 1422b7
Parameters
Packit 1422b7
..........
Packit 1422b7
Packit 1422b7
extradata
Packit 1422b7
~~~~~~~~~~
Packit 1422b7
Packit 1422b7
This is a mandatory parameter. It contains one or more characters, each of
Packit 1422b7
which terminates the match.
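
The difference between ``char-to`` (one or more characters) and ``char-sep``
(zero or more characters) can be sketched as follows. The function names are
hypothetical, and the handling of input that ends without a terminator is an
assumption of this sketch, not something stated in this document.

```python
def char_to(text, pos, extradata):
    """One or more characters, up to (not including) any char in extradata."""
    end = pos
    while end < len(text) and text[end] not in extradata:
        end += 1
    # Assumption: an empty match, or no terminator before the end of the
    # input, counts as a mismatch.
    if end == pos or end == len(text):
        return None
    return text[pos:end], end


def char_sep(text, pos, extradata):
    """Zero or more characters, up to any char in extradata."""
    end = pos
    while end < len(text) and text[end] not in extradata:
        end += 1
    # Assumption: reaching the end of the input also ends the match.
    return text[pos:end], end
```

Both functions return the extracted text and the offset at which parsing
continues; the terminating character itself is not consumed.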
Packit 1422b7
Packit 1422b7
rest
Packit 1422b7
####
Packit 1422b7
Packit 1422b7
Zero or more characters until end of line. Must always be at the end of the
rule, even though this condition is currently **not** checked. In any case,
any definitions after *rest* are ignored.
Packit 1422b7
Packit 1422b7
Note that the *rest* syntax should be avoided because it generates
a very broad match. If it needs to be used, assign it
the lowest priority among your parser definitions. Note that the
parser-specific priority is also lowest, so by default it will only
match if nothing else matches.
Packit 1422b7
Packit 1422b7
quoted-string
Packit 1422b7
#############   
Packit 1422b7
Packit 1422b7
Zero or more characters, surrounded by double quote marks.
Packit 1422b7
Quote marks are stripped from the match.
Packit 1422b7
Packit 1422b7
op-quoted-string
Packit 1422b7
################   
Packit 1422b7
Packit 1422b7
Zero or more characters, possibly surrounded by double quote marks.
If the first character is a quote mark, operates like quoted-string.
Otherwise, operates like "word". Quote marks are stripped from the match.
Packit 1422b7
Packit 1422b7
date-iso
Packit 1422b7
########    
Packit 1422b7
Date in ISO format ('YYYY-MM-DD').
Packit 1422b7
Packit 1422b7
time-24hr
Packit 1422b7
#########   
Packit 1422b7
Packit 1422b7
Time of format 'HH:MM:SS', where HH is 00..23.
Packit 1422b7
Packit 1422b7
time-12hr
Packit 1422b7
#########   
Packit 1422b7
Packit 1422b7
Time of format 'HH:MM:SS', where HH is 00..12.
Packit 1422b7
Packit 1422b7
duration
Packit 1422b7
########   
Packit 1422b7
Packit 1422b7
A duration is similar to a timestamp, except that
Packit 1422b7
it tells about time elapsed. As such, hours can be larger than 23
Packit 1422b7
and hours may also be specified by a single digit (this, for example,
Packit 1422b7
is commonly done in Cisco software).
Packit 1422b7
Packit 1422b7
Examples for durations are "12:05:01", "0:00:01" and "37:59:59" but not
"00:60:00" (MM and SS must still be within the usual range for
minutes and seconds).
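
As a rough illustration, the duration format can be approximated with a
regular expression. This is a sketch, not the parser's actual code; it
assumes one or two hour digits and exactly two digits each for minutes and
seconds.

```python
import re

# Assumption: 1-2 hour digits; minutes and seconds are two digits, 00..59.
DURATION = re.compile(r"\d{1,2}:[0-5]\d:[0-5]\d")


def is_duration(text):
    """True if text looks like a duration as described above."""
    return DURATION.fullmatch(text) is not None
```

Applied to the examples above, the first three values are accepted and
"00:60:00" is rejected.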
Packit 1422b7
Packit 1422b7
Packit 1422b7
date-rfc3164
Packit 1422b7
############
Packit 1422b7
Packit 1422b7
Valid date/time in RFC3164 format, e.g. 'Oct 29 09:47:08'.
Packit 1422b7
This parser implements several quirks to match malformed
Packit 1422b7
timestamps from some devices.
Packit 1422b7
Packit 1422b7
Parameters
Packit 1422b7
..........
Packit 1422b7
Packit 1422b7
format
Packit 1422b7
~~~~~~
Packit 1422b7
Packit 1422b7
Specifies the format of the json object. Possible values are
Packit 1422b7
Packit 1422b7
- **string** - string representation as given in input data
Packit 1422b7
- **timestamp-unix** - string converted to a Unix timestamp (seconds since epoch)
- **timestamp-unix-ms** - a kind of Unix timestamp, but with millisecond resolution.
  This format is understood for example by ElasticSearch. Note that RFC3164 does **not**
  contain subsecond resolution, so this option makes no sense for RFC3164-only data.
  It is useful, however, when processing mixed sources, some of which contain higher
  precision.
Packit 1422b7
Packit 1422b7
Packit 1422b7
date-rfc5424
Packit 1422b7
############
Packit 1422b7
Packit 1422b7
Valid date/time in RFC5424 format, e.g.:
Packit 1422b7
'1985-04-12T19:20:50.52-04:00'.
Packit 1422b7
Slightly different formats are allowed.
Packit 1422b7
Packit 1422b7
Parameters
Packit 1422b7
..........
Packit 1422b7
Packit 1422b7
format
Packit 1422b7
~~~~~~
Packit 1422b7
Packit 1422b7
Specifies the format of the json object. Possible values are
Packit 1422b7
Packit 1422b7
- **string** - string representation as given in input data
Packit 1422b7
- **timestamp-unix** - string converted to a Unix timestamp (seconds since epoch).
  If subsecond resolution is given in the original timestamp, it is lost.
- **timestamp-unix-ms** - a kind of Unix timestamp, but with millisecond resolution.
  This format is understood for example by ElasticSearch. Note that an RFC5424
  timestamp can contain higher than ms resolution. If so, the timestamp is
  truncated to millisecond resolution.
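
To make the difference between the two numeric formats concrete, the
following Python sketch converts an RFC5424 timestamp the way the list above
describes: seconds since epoch with the fraction dropped, and milliseconds
with the fraction truncated (not rounded). The function names are made up for
this sketch, it only accepts the strict RFC5424 form, and it is only meant
for timestamps at or after the epoch.

```python
import re
from datetime import datetime, timedelta, timezone

_RFC5424 = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})"
)
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_ms(ts):
    """Milliseconds since epoch; sub-millisecond digits are truncated."""
    m = _RFC5424.fullmatch(ts)
    if m is None:
        raise ValueError("not an RFC5424 timestamp: %r" % ts)
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    fraction = m.group(7) or ""
    zone = m.group(8)
    if zone == "Z":
        offset = timedelta(0)
    else:
        sign = 1 if zone[0] == "+" else -1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
    moment = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone(offset))
    seconds = (moment - _EPOCH) // timedelta(seconds=1)
    millis = int((fraction + "000")[:3])  # pad or cut to exactly 3 digits
    return seconds * 1000 + millis


def to_unix(ts):
    """Seconds since epoch; any subsecond part is lost."""
    return to_unix_ms(ts) // 1000
```

For the sample timestamp '1985-04-12T19:20:50.52-04:00' this yields
482196050 seconds and 482196050520 milliseconds.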
Packit 1422b7
Packit 1422b7
Packit 1422b7
Packit 1422b7
ipv4
Packit 1422b7
####
Packit 1422b7
Packit 1422b7
IPv4 address, in dot-decimal notation (AAA.BBB.CCC.DDD).
Packit 1422b7
Packit 1422b7
ipv6
Packit 1422b7
####
Packit 1422b7
Packit 1422b7
IPv6 address, in textual notation as specified in RFC4291.
Packit 1422b7
All formats specified in section 2.2 are supported, including
Packit 1422b7
embedded IPv4 address (e.g. "::13.1.68.3"). Note that a 
Packit 1422b7
**pure** IPv4 address ("13.1.68.3") is **not** valid and as
Packit 1422b7
such not recognized.
Packit 1422b7
Packit 1422b7
To avoid false positives, there must be either a whitespace
Packit 1422b7
character after the IPv6 address or the end of string must be
Packit 1422b7
reached.
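
The distinction between an embedded and a pure IPv4 address can be checked
with Python's standard ``ipaddress`` module. Note that this module is
independent of liblognorm and may differ from the ``ipv6`` parser in corner
cases; the sketch only illustrates the two examples from the text.

```python
import ipaddress


def is_ipv6(text):
    """True if text is a valid textual IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
    except ValueError:
        return False
    return True
```

Here ``is_ipv6("::13.1.68.3")`` holds, while ``is_ipv6("13.1.68.3")`` does
not.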
Packit 1422b7
Packit 1422b7
mac48
Packit 1422b7
#####
Packit 1422b7
Packit 1422b7
The standard (IEEE 802) format for printing MAC-48 addresses in
Packit 1422b7
human-friendly form is six groups of two hexadecimal digits,
Packit 1422b7
separated by hyphens (-) or colons (:), in transmission order
Packit 1422b7
(e.g. 01-23-45-67-89-ab or 01:23:45:67:89:ab ).
Packit 1422b7
This form is also commonly used for EUI-64.
Packit 1422b7
from: http://en.wikipedia.org/wiki/MAC_address
Packit 1422b7
Packit 1422b7
cef
Packit 1422b7
###
Packit 1422b7
Packit 1422b7
This parses ArcSight Common Event Format (CEF) as described in
the "Implementing ArcSight CEF" manual revision 20 (2013-06-15).

It matches a format that closely follows the spec. The header fields
are extracted into the field name container, all extensions are
extracted into a container called "Extensions" beneath it.
Packit 1422b7
Packit 1422b7
Example
Packit 1422b7
.......
Packit 1422b7
Packit 1422b7
Rule (compact format)::
Packit 1422b7
Packit 1422b7
    rule=:%f:cef%
Packit 1422b7
Packit 1422b7
Data::
Packit 1422b7
Packit 1422b7
    CEF:0|Vendor|Product|Version|Signature ID|some name|Severity| aa=field1 bb=this is a value cc=field 3
Packit 1422b7
Packit 1422b7
Result::
Packit 1422b7
Packit 1422b7
    {
Packit 1422b7
      "f": {
Packit 1422b7
        "DeviceVendor": "Vendor",
Packit 1422b7
        "DeviceProduct": "Product",
Packit 1422b7
        "DeviceVersion": "Version",
Packit 1422b7
        "SignatureID": "Signature ID",
Packit 1422b7
        "Name": "some name",
Packit 1422b7
        "Severity": "Severity",
Packit 1422b7
        "Extensions": {
Packit 1422b7
          "aa": "field1",
Packit 1422b7
          "bb": "this is a value",
Packit 1422b7
          "cc": "field 3"
Packit 1422b7
        }
Packit 1422b7
      }
Packit 1422b7
    }
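
How the header and the extensions end up in the result can be sketched in
Python. This is a simplified illustration that reproduces the example above;
it ignores the escaping rules of the CEF specification, and ``parse_cef`` is
a name made up for this sketch.

```python
import re

HEADER_FIELDS = ["DeviceVendor", "DeviceProduct", "DeviceVersion",
                 "SignatureID", "Name", "Severity"]

# A key=value pair whose value runs until the next " key=" or end of input.
EXTENSION = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_cef(line):
    """Split a CEF line into header fields and an "Extensions" container."""
    parts = line.split("|", 7)  # assumption: no escaped pipes in the header
    if len(parts) != 8 or not parts[0].startswith("CEF:"):
        return None
    result = dict(zip(HEADER_FIELDS, parts[1:7]))
    result["Extensions"] = dict(EXTENSION.findall(parts[7].strip()))
    return result
```

Because extension values may contain spaces, a value is taken to extend up to
the next ``key=`` token rather than to the next space.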
Packit 1422b7
Packit 1422b7
checkpoint-lea
Packit 1422b7
##############
Packit 1422b7
Packit 1422b7
This supports the LEA on-disk format. Unfortunately, the format
is underdocumented; the Checkpoint docs we could get hold of just
describe the API and provide a field dictionary. In a nutshell, what
we do is extract field names up to the colon and values up to the
semicolon. No escaping rules are known to us, so we assume none
exist (and as such no semicolon can be part of a value).
Packit 1422b7
Packit 1422b7
If someone has a definitive reference or a sample set to contribute
Packit 1422b7
to the project, please let us know and we will check if we need to
Packit 1422b7
add additional transformations.
Packit 1422b7
Packit 1422b7
Packit 1422b7
cisco-interface-spec
Packit 1422b7
####################
Packit 1422b7
Packit 1422b7
A Cisco interface specifier, as for example seen in PIX or ASA.
Packit 1422b7
The format contains a number of optional parts and is described
Packit 1422b7
as follows (in ABNF-like manner where square brackets indicate
Packit 1422b7
optional parts):
Packit 1422b7
Packit 1422b7
::
Packit 1422b7
Packit 1422b7
  [interface:]ip/port [SP (ip2/port2)] [[SP](username)]
Packit 1422b7
Packit 1422b7
Samples for such a spec are:
Packit 1422b7
Packit 1422b7
 * outside:192.168.52.102/50349
Packit 1422b7
 * inside:192.168.1.15/56543 (192.168.1.112/54543)
Packit 1422b7
 * outside:192.168.1.13/50179 (192.168.1.13/50179)(LOCAL\some.user)
Packit 1422b7
 * outside:192.168.1.25/41850(LOCAL\RG-867G8-DEL88D879BBFFC8) 
Packit 1422b7
 * inside:192.168.1.25/53 (192.168.1.25/53) (some.user)
Packit 1422b7
 * 192.168.1.15/0(LOCAL\RG-867G8-DEL88D879BBFFC8)
Packit 1422b7
Packit 1422b7
Note that the current version of liblognorm does not permit sole
IP addresses to be detected as a Cisco interface spec. However, we
are reviewing more Cisco messages and need to decide if this is
to be supported. The problem here is that this would create a much
broader parser which would potentially match many things that are
**not** Cisco interface specs.
Packit 1422b7
Packit 1422b7
As this object extracts multiple subelements, it creates a JSON
structure.

Let's for example look at this definition (compact format)::
Packit 1422b7
Packit 1422b7
    %ifaddr:cisco-interface-spec%
Packit 1422b7
Packit 1422b7
and assume the following message is to be parsed::
Packit 1422b7
Packit 1422b7
 outside:192.168.1.13/50179 (192.168.1.13/50179) (LOCAL\some.user)
Packit 1422b7
Packit 1422b7
Then the resulting JSON will be as follows::
Packit 1422b7
Packit 1422b7
 { "ifaddr": { "interface": "outside", "ip": "192.168.1.13",
               "port": "50179", "ip2": "192.168.1.13",
               "port2": "50179", "user": "LOCAL\\some.user" } }
Packit 1422b7
Packit 1422b7
Subcomponents that are not given in the to-be-normalized string are
Packit 1422b7
also not present in the resulting JSON.
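
The ABNF-like description can be approximated by a regular expression. The
following Python sketch is an illustration only and is not how the parser is
implemented; the pattern and the name ``parse_interface_spec`` are
assumptions of this example.

```python
import re

IP = r"\d{1,3}(?:\.\d{1,3}){3}"
SPEC = re.compile(
    r"(?:(?P<interface>[^\s:/]+):)?"             # [interface:]
    rf"(?P<ip>{IP})/(?P<port>\d+)"               # ip/port
    rf"(?: \((?P<ip2>{IP})/(?P<port2>\d+)\))?"   # [SP (ip2/port2)]
    r"(?: ?\((?P<user>[^)]+)\))?"                # [[SP](username)]
)


def parse_interface_spec(text):
    """Return the subelements that are present, or None on mismatch."""
    m = SPEC.fullmatch(text)
    if m is None:
        return None
    return {k: v for k, v in m.groupdict().items() if v is not None}
```

Optional parts that do not occur in the input are left out of the returned
dictionary, and a sole IP address without a port does not match.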
Packit 1422b7
Packit 1422b7
iptables
Packit 1422b7
########    
Packit 1422b7
Packit 1422b7
Name=value pairs, separated by spaces, as in Netfilter log messages.
Packit 1422b7
Name of the selector is not used; names from the line are 
Packit 1422b7
used instead. This selector always matches everything till 
Packit 1422b7
end of the line. Cannot match zero characters.
Packit 1422b7
Packit 1422b7
cisco-interface-spec
Packit 1422b7
####################
Packit 1422b7
Packit 1422b7
This is an experimental parser. It is used to detect Cisco Interface
Packit 1422b7
Specifications. A sample of them is:
Packit 1422b7
Packit 1422b7
::
Packit 1422b7
Packit 1422b7
   outside:176.97.252.102/50349
Packit 1422b7
Packit 1422b7
Note that this parser does not yet extract the individual parts
Packit 1422b7
due to the restrictions in current liblognorm. This is planned for
Packit 1422b7
after a general algorithm overhaul.
Packit 1422b7
Packit 1422b7
In order to match, this syntax must start on a non-whitespace char
Packit 1422b7
other than colon.
Packit 1422b7
Packit 1422b7
json
Packit 1422b7
####
Packit 1422b7
This parses native JSON from the message. All data up to the first non-JSON
content is parsed into the field. There may be any other field after the JSON,
including another JSON section.
Packit 1422b7
Packit 1422b7
Note that any white space after the actual JSON
Packit 1422b7
is considered **to be part of the JSON**. So you cannot filter on whitespace
Packit 1422b7
after the JSON.
Packit 1422b7
Packit 1422b7
Example
Packit 1422b7
.......
Packit 1422b7
Packit 1422b7
Rule (compact format)::
Packit 1422b7
Packit 1422b7
    rule=:%field1:json%interim text %field2:json%
Packit 1422b7
Packit 1422b7
Data::
Packit 1422b7
Packit 1422b7
   {"f1": "1"} interim text {"f2": 2}
Packit 1422b7
Packit 1422b7
Result::
Packit 1422b7
Packit 1422b7
   { "field2": { "f2": 2 }, "field1": { "f1": "1" } }
Packit 1422b7
Packit 1422b7
Note also that the space before "interim" must **not** be given in the
Packit 1422b7
rule, as it is consumed by the JSON parser. However, the space after
Packit 1422b7
"text" is required.
Packit 1422b7
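
The whitespace behaviour can be imitated with Python's standard ``json``
module. This is only a sketch of the described semantics, not liblognorm
code; ``parse_json_field`` is a hypothetical helper.

```python
import json

_DECODER = json.JSONDecoder()


def parse_json_field(text, pos):
    """Parse one JSON value at pos and swallow the whitespace after it."""
    value, end = _DECODER.raw_decode(text, pos)
    while end < len(text) and text[end].isspace():
        end += 1  # trailing whitespace is treated as part of the JSON
    return value, end


line = '{"f1": "1"} interim text {"f2": 2}'
field1, pos = parse_json_field(line, 0)
literal = "interim text "  # no leading space: it was consumed above
assert line.startswith(literal, pos)
field2, pos = parse_json_field(line, pos + len(literal))
result = {"field1": field1, "field2": field2}
```

After the first call the offset already points at "interim", which is why
the literal in the rule must not start with a space.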
Packit 1422b7
alternative
Packit 1422b7
###########
Packit 1422b7
Packit 1422b7
This type permits specifying alternative ways of parsing within a single
definition. This can make writing rule bases easier. It also permits the
v2 engine to create a more efficient parsing data structure resulting in
better performance (to be noticed only in extreme cases, though).
Packit 1422b7
Packit 1422b7
An example explains this parser best::
Packit 1422b7
Packit 1422b7
    rule=:a %
            {"type":"alternative",
             "parser": [
                        {"name":"num", "type":"number"},
                        {"name":"hex", "type":"hexnumber"}
                       ]
            }% b
Packit 1422b7
Packit 1422b7
This rule matches messages like these::
Packit 1422b7
Packit 1422b7
   a 1234 b
Packit 1422b7
   a 0xff b
Packit 1422b7
Packit 1422b7
Note that the "parser" parameter here needs to be provided with an array
of *alternatives*. In this case, the JSON array is **not** interpreted as
a sequence. Note, though, that you can nest definitions by using custom types.
Packit 1422b7
 
Packit 1422b7
repeat
Packit 1422b7
######
Packit 1422b7
This parser is used to extract a repeated sequence with the same pattern.
Packit 1422b7
Packit 1422b7
An example explains this parser best::
Packit 1422b7
Packit 1422b7
    rule=:a %
            {"name":"numbers", "type":"repeat",
                "parser":[
                           {"type":"number", "name":"n1"},
                           {"type":"literal", "text":":"},
                           {"type":"number", "name":"n2"}
                         ],
                "while":[
                           {"type":"literal", "text":", "}
                        ]
             }% b
Packit 1422b7
Packit 1422b7
This matches lines like this::
Packit 1422b7
    
Packit 1422b7
    a 1:2, 3:4, 5:6, 7:8 b
Packit 1422b7
Packit 1422b7
and will generate this JSON::
Packit 1422b7
Packit 1422b7
    { "numbers": [
                   { "n2": "2", "n1": "1" },
                   { "n2": "4", "n1": "3" },
                   { "n2": "6", "n1": "5" },
                   { "n2": "8", "n1": "7" }
                 ]
    }
Packit 1422b7
Packit 1422b7
As can be seen, there are two parameters to "repeat". The "parser"
parameter specifies which type should be repeatedly parsed out of
the input data. We could use a single parser for that, but in the example
above we parse a sequence. Note the nested array in the "parser" parameter.
Packit 1422b7
Packit 1422b7
If we just wanted to match a single list of numbers like::
Packit 1422b7
Packit 1422b7
    a 1, 2, 3, 4 b
Packit 1422b7
Packit 1422b7
we could use this definition::
Packit 1422b7
Packit 1422b7
    rule=:a %
            {"name":"numbers", "type":"repeat",
                "parser":
                         {"type":"number", "name":"n"},
                "while":
                         {"type":"literal", "text":", "}
             }% b
Packit 1422b7
Packit 1422b7
Note that in this example we also removed the redundant single-element
Packit 1422b7
array in "while".
Packit 1422b7
Packit 1422b7
The "while" parameter tells "repeat" how long to do repeat processing. It
is specified by any parser, including a nested sequence of parsers (array).
As long as the "while" part matches, the repetition is continued. If it no
longer matches, "repeat" processing is successfully completed. Note that
the "parser" parameter **must** match at least once, otherwise "repeat"
fails.
Packit 1422b7
Packit 1422b7
In the above sample, "while" mismatches after "4", because no ", " follows.
Then, the parser terminates, and according to definition the literal " b"
is matched, which will result in a successful rule match (note: the "a ",
" b" literals are just here for explanatory purposes and could be any
other rule element).
Packit 1422b7
Packit 1422b7
Sometimes we need to deal with malformed messages. For example, we
Packit 1422b7
could have a sequence like this::
Packit 1422b7
Packit 1422b7
    a 1:2, 3:4,5:6, 7:8 b
Packit 1422b7
Packit 1422b7
Note the missing space after "4,". To handle such cases, we can nest the
Packit 1422b7
"alternative" parser inside "while"::
Packit 1422b7
Packit 1422b7
    rule=:a %
            {"name":"numbers", "type":"repeat",
                "parser":[
                           {"type":"number", "name":"n1"},
                           {"type":"literal", "text":":"},
                           {"type":"number", "name":"n2"}
                         ],
                "while": {
                            "type":"alternative", "parser": [
                                    {"type":"literal", "text":", "},
                                    {"type":"literal", "text":","}
                             ]
                         }
             }% b
Packit 1422b7
Packit 1422b7
This definition handles numbers being delimited by either ", " or ",".
Packit 1422b7
Packit 1422b7
For people with programming skills, the "repeat" parser is described
Packit 1422b7
by this pseudocode::
Packit 1422b7
Packit 1422b7
    do
        parse via parsers given in "parser"
        if parsing fails
            abort "repeat" unsuccessful
        parse via parsers given in "while"
    while the "while" parsers parsed successfully
    if not aborted, flag "repeat" as successful
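
The pseudocode can be turned into a small runnable Python model. This is a
sketch of the control flow only, not the liblognorm implementation: the
helper names (``number``, ``literal``, ``sequence``, ``repeat``) are made up
here, each parser is modelled as a function returning extracted fields plus
the new offset (or ``None`` on mismatch), and parser options are not
modelled.

```python
import re

_DIGITS = re.compile(r"\d+")


def number(name):
    """Parser for a run of decimal digits, stored under name."""
    def parse(text, pos):
        m = _DIGITS.match(text, pos)
        return ({name: m.group()}, m.end()) if m else None
    return parse


def literal(value):
    """Parser for fixed text; it extracts no fields."""
    def parse(text, pos):
        return ({}, pos + len(value)) if text.startswith(value, pos) else None
    return parse


def sequence(*parsers):
    """Run parsers one after another, merging their fields."""
    def parse(text, pos):
        fields = {}
        for parser in parsers:
            outcome = parser(text, pos)
            if outcome is None:
                return None
            fields.update(outcome[0])
            pos = outcome[1]
        return fields, pos
    return parse


def repeat(parser, while_):
    """Model of "repeat": parse, then continue as long as while_ matches."""
    def parse(text, pos):
        items = []
        while True:
            outcome = parser(text, pos)
            if outcome is None:
                return None        # abort: "repeat" unsuccessful
            items.append(outcome[0])
            pos = outcome[1]
            separator = while_(text, pos)
            if separator is None:
                return items, pos  # "while" mismatched: "repeat" successful
            pos = separator[1]
    return parse


pairs = repeat(sequence(number("n1"), literal(":"), number("n2")),
               literal(", "))
items, end = pairs("1:2, 3:4, 5:6, 7:8 b", 0)
```

For this input, ``items`` holds the four ``n1``/``n2`` pairs and ``end``
points at the remaining " b", which the rest of the rule would then match.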
Packit 1422b7
Packit 1422b7
Parameters
Packit 1422b7
..........
Packit 1422b7
Packit 1422b7
option.permitMismatchInParser
Packit 1422b7
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Packit 1422b7
If set to "True", permits "repeat" to complete successfully even when
the parser processing failed. This is false by default, and can be
set to true to cover some border cases, where the while part cannot
definitely detect the end of processing. An example of such a border
case is a listing of flags, being terminated by a double space where
each flag is delimited by single spaces. For example, Cisco products
generate such messages (note the flags part)::
Packit 1422b7
Packit 1422b7
    Aug 18 13:18:45 192.168.0.1 %ASA-6-106015: Deny TCP (no connection) from 10.252.88.66/443 to 10.79.249.222/52746 flags RST  on interface outside
Packit 1422b7
Packit 1422b7
cee-syslog
Packit 1422b7
##########
Packit 1422b7
This parses cee syslog from the message. This format has been defined
Packit 1422b7
by Mitre CEE as well as Project Lumberjack.
Packit 1422b7
Packit 1422b7
This format essentially is JSON with additional restrictions:
Packit 1422b7
Packit 1422b7
 * The message must start with "@cee:"
 * a JSON **object** must immediately follow (whitespace before it is permitted,
   but a JSON array is **not** permitted)
 * after the JSON, there must be no other non-whitespace characters.
Packit 1422b7
Packit 1422b7
In other words: the message must consist of a single JSON object only, 
Packit 1422b7
prefixed by the "@cee:" cookie.
Packit 1422b7
Packit 1422b7
Note that the cee cookie is case sensitive, so "@CEE:" is **NOT** valid.
Packit 1422b7
Packit 1422b7
Prefixes
Packit 1422b7
--------
Packit 1422b7
Packit 1422b7
Several rules can have a common prefix. You can set it once with this 
Packit 1422b7
syntax::
Packit 1422b7
Packit 1422b7
    prefix=<prefix match description>
Packit 1422b7
    
Packit 1422b7
Prefix match description syntax is the same as rule match description. 
Packit 1422b7
Every following rule will be treated as an addition to this prefix.
Packit 1422b7
Packit 1422b7
Prefix can be reset to default (empty value) by the line::
Packit 1422b7
Packit 1422b7
    prefix=
Packit 1422b7
Packit 1422b7
You can define a prefix for devices that produce the same header in each
message. We assume that you have your rules sorted by device. In such a
case you can take the header of the rules and use it with the prefix
variable. Here is an example of a rule for IPTables (legacy format, to be converted later)::
Packit 1422b7
Packit 1422b7
    prefix=%date:date-rfc3164% %host:word% %tag:char-to:-\x3a%:
Packit 1422b7
    rule=:INBOUND%INBOUND:char-to:-\x3a%: IN=%IN:word% PHYSIN=%PHYSIN:word% OUT=%OUT:word% PHYSOUT=%PHYSOUT:word% SRC=%source:ipv4% DST=%destination:ipv4% LEN=%LEN:number% TOS=%TOS:char-to: % PREC=%PREC:word% TTL=%TTL:number% ID=%ID:number% DF PROTO=%PROTO:word% SPT=%SPT:number% DPT=%DPT:number% WINDOW=%WINDOW:number% RES=0x00 ACK SYN URGP=%URGP:number%
Packit 1422b7
Packit 1422b7
Usually, every rule would hold what is defined in the prefix at its 
Packit 1422b7
beginning. But since we can define the prefix, we can save that work in 
Packit 1422b7
every line and just make the rules for the log lines. This saves us a lot 
Packit 1422b7
of work and even saves space.
Packit 1422b7
Packit 1422b7
In a rulebase you can use multiple prefixes. A prefix applies to the
rules that follow it. If another prefix is set, it replaces the previous
one for all subsequent rules.
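
The effect of ``prefix=`` lines on the rules that follow can be sketched in
Python. This is a simplified model of the behaviour described above, not the
rulebase loader itself; ``expand_rules`` is a hypothetical helper, and other
rulebase line types are ignored.

```python
def expand_rules(lines):
    """Return each rule with the prefix active at that point prepended."""
    prefix = ""
    expanded = []
    for line in lines:
        if line.startswith("prefix="):
            # A new prefix (possibly empty) replaces the previous one.
            prefix = line[len("prefix="):]
        elif line.startswith("rule="):
            tags, _, sample = line[len("rule="):].partition(":")
            expanded.append(f"rule={tags}:{prefix}{sample}")
    return expanded
```

A rule that follows ``prefix=`` with an empty value is returned unchanged,
because the prefix has been reset.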
Packit 1422b7
Packit 1422b7
Rule tags
Packit 1422b7
---------
Packit 1422b7
Packit 1422b7
Rule tagging capability permits very easy classification of syslog
messages and log records in general. So you can not only extract data from
your various log sources, you can also classify events, for example, as
being a "login", a "logout" or a firewall "denied access". This makes it
very easy to look at specific subsets of messages and process them in ways
specific to the information being conveyed.
Packit 1422b7
Packit 1422b7
To see how it works, let’s first define what a tag is:
Packit 1422b7
Packit 1422b7
A tag is a simple alphanumeric string that identifies a specific type of 
Packit 1422b7
object, action, status, etc. For example, we can have object tags for 
Packit 1422b7
firewalls and servers. For simplicity, let’s call them "firewall" and 
Packit 1422b7
"server". Then, we can have action tags like "login", "logout" and 
Packit 1422b7
"connectionOpen". Status tags could include "success" or "fail", among 
Packit 1422b7
others. Tags form a flat space, there is no inherent relationship between 
Packit 1422b7
them (but this may be added later on top of the current implementation). 
Packit 1422b7
Think of tags like the tag cloud in a blogging system. Tags can be defined 
Packit 1422b7
for any reason and need. A single event can be associated with as many 
Packit 1422b7
tags as required. 
Packit 1422b7
Packit 1422b7
Assigning tags to messages is simple. A rule contains both the sample of 
Packit 1422b7
the message (including the extracted fields) as well as the tags. 
Packit 1422b7
Have a look at this sample::
Packit 1422b7
Packit 1422b7
    rule=:sshd[%pid:number%]: Invalid user %user:word% from %src-ip:ipv4%
Packit 1422b7
Packit 1422b7
Here, we have a rule that shows an invalid ssh login request. The various 
Packit 1422b7
fields are used to extract information into a well-defined structure. Have 
Packit 1422b7
you ever wondered why every rule starts with a colon? Now, here is the 
Packit 1422b7
answer: the colon separates the tag part from the actual sample part. 
Packit 1422b7
Now, you can create a rule like this::
Packit 1422b7
Packit 1422b7
    rule=ssh,user,login,fail:sshd[%pid:number%]: Invalid user %user:word% from %src-ip:ipv4%
Packit 1422b7
Packit 1422b7
Note the "ssh,user,login,fail" part in front of the colon. These are the 
Packit 1422b7
four tags the user has decided to assign to this event. What now happens 
Packit 1422b7
is that the normalizer does not only extract the information from the 
Packit 1422b7
message if it finds a match, but it also adds the tags as metadata. Once 
Packit 1422b7
normalization is done, one can not only query the individual fields, but 
Packit 1422b7
also query if a specific tag is associated with this event. For example, 
Packit 1422b7
to find all ssh-related events (provided the rules are built that way), 
Packit 1422b7
you can normalize a large log and select only that subset of the 
Packit 1422b7
normalized log that contains the tag "ssh".
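
How the colon separates tags from the sample, and how tagged events can be
selected afterwards, can be sketched in Python. The helper names are made up
for this illustration, and the metadata key ``event.tags`` is an assumption
of this sketch; check the actual normalizer output for the name used.

```python
def split_rule(line):
    """Split a "rule=" line into its tag list and its sample part."""
    tags, _, sample = line[len("rule="):].partition(":")
    return [tag for tag in tags.split(",") if tag], sample


def with_tag(events, tag):
    """Select the normalized events that carry the given tag."""
    # Assumption: tags are stored as a list under "event.tags".
    return [event for event in events if tag in event.get("event.tags", [])]
```

Since the split happens at the first colon, later colons (as in "sshd[...]:")
remain part of the sample.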
Packit 1422b7
Packit 1422b7
Log annotations
Packit 1422b7
---------------
Packit 1422b7
Packit 1422b7
In short, annotations allow adding arbitrary attributes to a parsed
message, depending on rule tags. Values of these attributes are fixed;
they cannot be derived from variable fields. The syntax is as follows::
Packit 1422b7
Packit 1422b7
    annotate=<tag>:+<field name>="<field value>"
Packit 1422b7
Packit 1422b7
Field value should always be enclosed in double quote marks.
Packit 1422b7
Packit 1422b7
There can be multiple annotations for the same tag.
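
The following Python sketch models how annotations could be attached to a
parsed message based on its tags. It is an illustration under assumptions:
the helper names, the sample ``annotate=`` lines used with it, and the
``event.tags`` metadata key are hypothetical and not taken from this
document.

```python
import re

# Mirrors the syntax annotate=<tag>:+<field name>="<field value>"
ANNOTATE = re.compile(
    r'annotate=(?P<tag>[^:]+):\+(?P<field>[^=]+)="(?P<value>.*)"')


def load_annotations(lines):
    """Collect the fixed attributes per tag; a tag may occur several times."""
    table = {}
    for line in lines:
        m = ANNOTATE.fullmatch(line)
        if m:
            table.setdefault(m["tag"], {})[m["field"]] = m["value"]
    return table


def annotate(event, table):
    """Return a copy of event with the attributes of all its tags added."""
    annotated = dict(event)
    for tag in event.get("event.tags", []):  # assumed metadata key
        annotated.update(table.get(tag, {}))
    return annotated
```

The added values are constants taken from the annotation lines; nothing is
derived from the fields extracted by the rule.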
Packit 1422b7
Packit 1422b7
Examples
Packit 1422b7
--------
Packit 1422b7
Packit 1422b7
Look at :doc:`sample rulebase <sample_rulebase>` for configuration 
Packit 1422b7
examples and matching log lines. Note that the examples are currently
Packit 1422b7
in legacy format, only.