Liblognorm internals
====================

Parse-tree
----------

A parse-tree is generated each time the normalization process is set up.

You could also call it an optimized rulebase. Each message runs through
this tree, which consists of parsers and fields, and is compared to it. The
message either fits into a branch or it does not. If it fits, it can be
normalized. If it does not fit any branch in the tree, a fitting
sample has to be created for this message.
 
The tree is built from branches. These branches consist of three things:
nodes, paths and parsers.

A node is typically a literal part of a message. It is followed either by a
parser or by several alternative literals, in which case one of the paths
must be selected. A parser is always followed by a node. Parsers are like
variables and thus the core structure of a sample. They fill the property
fields that are ultimately needed to normalize the message.
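
The relationship between nodes, paths and parsers can be illustrated with a
small Python sketch. It is not liblognorm's actual C implementation: the
``Node`` class, the ``add_sample`` and ``normalize`` helpers, the two-entry
``PARSERS`` table and the list-based sample format are all assumptions made
for illustration.

```python
import re

# Hypothetical parser table; liblognorm itself ships many more parsers.
PARSERS = {
    "word": re.compile(r"\S+"),
    "number": re.compile(r"\d+"),
}


class Node:
    """A point in the tree where literal paths or parsers may follow."""

    def __init__(self):
        self.literals = {}     # literal text -> next Node
        self.parsers = {}      # (field name, parser name) -> next Node
        self.terminal = False  # True if a complete sample ends here


def add_sample(root, sample):
    """Insert a sample given as a list of literals (str) and
    (field, parser) tuples, reusing branches that already exist."""
    node = root
    for part in sample:
        table = node.literals if isinstance(part, str) else node.parsers
        node = table.setdefault(part, Node())
    node.terminal = True


def normalize(node, msg, pos=0):
    """Return the extracted fields as a dict, or None if no branch fits."""
    if pos == len(msg) and node.terminal:
        return {}
    # Try every literal path that matches at the current position.
    for text, child in node.literals.items():
        if msg.startswith(text, pos):
            fields = normalize(child, msg, pos + len(text))
            if fields is not None:
                return fields
    # Then try every parser; a successful parser fills a property field.
    for (field, parser), child in node.parsers.items():
        m = PARSERS[parser].match(msg, pos)
        if m:
            fields = normalize(child, msg, m.end())
            if fields is not None:
                return {field: m.group(), **fields}
    return None


root = Node()
add_sample(root, ["login from ", ("ip", "word"), " port ", ("port", "number")])
add_sample(root, ["login from ", ("ip", "word"), " failed"])

print(normalize(root, "login from 10.0.0.1 port 22"))
print(normalize(root, "login from 10.0.0.1 failed"))
print(normalize(root, "logout"))  # fits no branch, so the result is None
```

The two samples share the literal ``login from`` and the ``ip`` parser, so the
sketch stores that prefix once and branches only at the node after the parser.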

A few notes on the optimization of a parse-tree:

The parse-tree is always optimized, whether or not samples of a similar
kind are next to each other. Even if you put the samples in a totally
random order, the result should always be the same parse-tree. Therefore,
you do not need to optimize the tree yourself. It reuses prefixes that are
already in the tree; a new node is added only where a sample differs from
them.
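
The order independence described above can be sketched as follows. This is a
simplified model, assuming one tree level per whitespace-separated word and
treating field definitions as opaque words; ``build_tree`` is a hypothetical
helper, not a liblognorm function.

```python
import random


def build_tree(samples):
    """Build a nested-dict prefix tree, one level per word."""
    tree = {}
    for sample in samples:
        node = tree
        for word in sample.split():
            # Reuse the branch if this prefix is already present;
            # only a differing word creates a new node.
            node = node.setdefault(word, {})
    return tree


samples = [
    "user %name:word% logged in",
    "user %name:word% logged out",
    "disk %dev:word% is full",
]

shuffled = samples[:]
random.shuffle(shuffled)

# Dict equality ignores insertion order, so both trees compare equal.
assert build_tree(samples) == build_tree(shuffled)
```

In this model the first two samples share every level up to ``logged``, and
the tree splits only at the final word.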

One case where rule order can be significant is when a message can match
two or more different rules. This can occur when the rules differ in
their parsers. If in doubt, use the :doc:`lognormalizer <lognormalizer>` tool
to debug.
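
A minimal sketch of such an ambiguity, using Python regular expressions as
stand-ins for parsers. The rule names and patterns are hypothetical, and the
snippet only shows that both rules fit; it does not model how liblognorm
chooses between them.

```python
import re

# Hypothetical rules that share the literal "user " and differ only in
# the parser that follows it.
RULES = [
    ("by_name", re.compile(r"user (?P<name>\S+)")),
    ("by_id", re.compile(r"user (?P<id>\d+)")),
]


def matching_rules(msg):
    """Return the names of all rules whose pattern covers the whole message."""
    return [name for name, pattern in RULES if pattern.fullmatch(msg)]


print(matching_rules("user alice"))  # -> ['by_name']
print(matching_rules("user 42"))     # -> ['by_name', 'by_id']
```

A matcher that simply took the first hit would pick ``by_name`` for
``user 42``, and swapping the two entries would change the result, which is
why order can matter in such cases.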