'\" t
.\"     Title: bogoutil
.\"    Author: [see the "AUTHOR" section]
.\" Generator: DocBook XSL Stylesheets vsnapshot <http://docbook.sf.net/>
.\"      Date: 05/19/2019
.\"    Manual: Bogofilter Reference Manual
.\"    Source: Bogofilter
.\"  Language: English
.\"
.TH "BOGOUTIL" "1" "05/19/2019" "Bogofilter" "Bogofilter Reference Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
bogoutil \- Dumps, loads, and maintains bogofilter database files
.SH "SYNOPSIS"
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR {\-h | \-V}
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR [options] {\-d\ \fIfile\fR | \-H\ \fIfile\fR | \-l\ \fIfile\fR | \-m\ \fIfile\fR | \-w\ \fIfile\fR | \-p\ \fIfile\fR}
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR {\-r\ \fIfile\fR | \-R\ \fIfile\fR}
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR {\-\-db\-print\-leafpage\-count\ \fIfile\fR | \-\-db\-print\-pagesize\ \fIfile\fR | \-\-db\-verify\ \fIfile\fR | \-\-db\-checkpoint\ \fIdirectory\fR | \-\-db\-list\-logfiles\ \fIdirectory\fR\ [flag...] | \-\-db\-prune\ \fIdirectory\fR | \-\-db\-recover\ \fIdirectory\fR | \-\-db\-recover\-harder\ \fIdirectory\fR | \-\-db\-remove\-environment\ \fIdirectory\fR}
.PP
where
\fBoptions\fR
is
.HP \w'\fBbogoutil\fR\ 'u
\fBbogoutil\fR [\-v] [\-n] [\-C] [\-D] [\-a\ \fIage\fR] [\-c\ \fIcount\fR] [\-s\ \fImin,max\fR] [\-y\ \fIdate\fR] [\-I\ \fIfile\fR] [\-O\ \fIfile\fR] [\-x\ \fIflags\fR] [\-\-config\-file\ \fIfile\fR]
.SH "DESCRIPTION"
.PP
Bogoutil
is part of the
bogofilter
Bayesian spam filter package\&.
.PP
It is used to dump and load
bogofilter\*(Aqs Berkeley DB databases to and from text files, to perform database maintenance functions, and to display the values for specific words\&.
.SH "OPTIONS"
.PP
The
\fB\-d \fR\fB\fIfile\fR\fR
option tells
bogoutil
to print the contents of the database file to
\fBstdout\fR\&.
.PP
The
\fB\-H \fR\fB\fIfile\fR\fR
option tells
bogoutil
to print a histogram of the database file to
\fBstdout\fR\&. The output is similar to
bogofilter \-vv\&. Finally, hapaxes (tokens which were only seen once) and pure tokens (tokens which were encountered only in ham or only in spam) are counted\&.
.PP
The
\fB\-l \fR\fB\fIfile\fR\fR
option tells
bogoutil
to load the data from
\fBstdin\fR
into the database file\&. If the database file exists,
\fBstdin\fR
data is merged into the database file, with counts added up\&.
.PP
The
\fB\-m \fR\fB\fIfile\fR\fR
option tells
bogoutil
to perform maintenance functions on the specified database, i\&.e\&. discard tokens that are older than desired, have counts that are too small, or sizes (lengths) that are too long or too short\&.
.PP
The
\fB\-w \fR\fB\fIfile\fR\fR
option tells
bogoutil
to display token information from the database file\&. The option takes an argument, which is either the name of the wordlist (usually wordlist\&.db) or the name of the directory containing it\&. Tokens can be listed on the command line or piped to
bogoutil\&. When there are extra arguments on the command line,
bogoutil
will use them as the tokens to look up\&. If there are no extra arguments,
bogoutil
will read tokens from
\fBstdin\fR\&.
.PP
The
\fB\-p \fR\fB\fIfile\fR\fR
option tells
bogoutil
to display the database information for one or more tokens\&. The display includes a probability column with the token\*(Aqs spam score (computed using
bogofilter\*(Aqs default values)\&. Option
\fB\-p\fR
takes the same arguments as option
\fB\-w\fR\&.
.PP
The
\fB\-r \fR\fB\fIfile\fR\fR
option tells
bogoutil
to recalculate the ROBX value and print it as a six\-digit fraction\&.
.PP
The
\fB\-R \fR\fB\fIfile\fR\fR
option does the same as
\fB\-r\fR, but saves the result in the training database without printing it\&.
.PP
The
\fB\-I \fR\fB\fIfile\fR\fR
option tells
bogoutil
to read its input from
\fIfile\fR
rather than stdin\&.
.PP
The
\fB\-O \fR\fB\fIfile\fR\fR
option tells
bogoutil
to write its output to
\fIfile\fR
rather than stdout\&.
.PP
The
\fB\-v\fR
option produces verbose output on
\fBstderr\fR\&. This option is primarily useful for debugging\&.
.PP
The
\fB\-C\fR
option inhibits reading configuration files and lets
bogoutil
run with the defaults\&.
.PP
The
\fB\-\-config\-file \fR\fB\fIfile\fR\fR
option tells
bogoutil
to read
\fIfile\fR
instead of the standard configuration file\&.
.PP
The
\fB\-D\fR
option redirects debug output to stdout (it normally goes to stderr)\&.
.PP
The
\fB\-x \fR\fB\fIflags\fR\fR
option sets debugging flags\&.
.PP
Option
\fB\-n\fR
stands for "replace non\-ascii characters"\&. It replaces characters that have the high bit (0x80) set with question marks\&. This can be useful if a word list has lots of unreadable tokens, for example from Asian spam\&. The "bad" characters are converted to question marks and matching tokens are combined when used with
\fB\-m\fR
or
\fB\-l\fR, but not with
\fB\-d\fR\&.
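.PP
The replacement and merging described for
\fB\-n\fR
can be pictured with a short Python sketch\&. It is an illustration only and is not taken from
bogoutil\*(Aqs source; the function names are hypothetical, and summing the counts of tokens that become identical is an assumption of the sketch\&.
.sp
.RS 4
.nf
```python
def replace_high_bit(token: bytes) -> bytes:
    """Replace every byte that has the 0x80 bit set with a question mark."""
    return bytes(ord("?") if b & 0x80 else b for b in token)


def combine(entries):
    """Merge tokens that become identical after replacement.

    entries is an iterable of (token_bytes, count) pairs.
    Summing the counts is an assumption of this sketch.
    """
    merged = {}
    for token, count in entries:
        key = replace_high_bit(token)
        merged[key] = merged.get(key, 0) + count
    return merged
```
.fi
.RE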
.PP
Option
\fB\-a age\fR
indicates an acceptable token age, with older ones being discarded\&. The age can be a date (in the form YYYYMMDD) or a day count, i\&.e\&. discard tokens older than
\fBage\fR
days\&.
.PP
Option
\fB\-c value\fR
indicates that tokens with counts less than or equal to
\fBvalue\fR
are to be discarded\&.
.PP
Option
\fB\-s min,max\fR
is used to discard tokens based on their size, i\&.e\&. length\&. All tokens shorter than
\fBmin\fR
or longer than
\fBmax\fR
will be discarded\&.
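.PP
Taken together, the
\fB\-a\fR,
\fB\-c\fR
and
\fB\-s\fR
rules amount to a simple keep\-or\-discard test per token\&. The Python sketch below is a hedged illustration, not
bogoutil\*(Aqs implementation: the function and parameter names are hypothetical, only the day\-count form of
\fB\-a\fR
is modelled, and measuring token size with
len()
is an assumption\&.
.sp
.RS 4
.nf
```python
from datetime import date, timedelta


def keep_token(word: str, count: int, last_seen: date, today: date,
               max_age_days=None, min_count=None, size_range=None) -> bool:
    """Return True if a token survives the three maintenance filters."""
    # Age rule: discard tokens older than max_age_days days.
    if max_age_days is not None:
        if last_seen < today - timedelta(days=max_age_days):
            return False
    # Count rule: discard counts less than or equal to min_count.
    if min_count is not None and count <= min_count:
        return False
    # Size rule: discard tokens shorter than min or longer than max.
    if size_range is not None:
        shortest, longest = size_range
        if len(word) < shortest or len(word) > longest:
            return False
    return True
```
.fi
.RE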
.PP
Option
\fB\-y date\fR
specifies the date to give to tokens that don\*(Aqt have dates\&. The format is YYYYMMDD\&.
.PP
The
\fB\-h\fR
option prints the help message and exits\&.
.PP
The
\fB\-V\fR
option prints the version number and exits\&.
.SH "ENVIRONMENT MAINTENANCE"
.PP
The
\fB\-\-db\-checkpoint \fR\fB\fIdir\fR\fR
option causes
bogoutil
to flush the buffer caches and checkpoint the database environment\&.
.PP
The
\fB\-\-db\-list\-logfiles \fR\fB\fIdir\fR\fR
option causes
bogoutil
to list the log files in the environment\&. Zero or more keywords can be added or combined (separated by whitespace) to modify the behavior of this mode\&. The default behavior is to list only inactive log files with relative paths\&. You can add
\fBall\fR
to list all log files (inactive and active)\&. You can add
\fBabsolute\fR
to switch the listing to absolute paths\&.
.PP
The
\fB\-\-db\-prune \fR\fB\fIdir\fR\fR
option causes
bogoutil
to checkpoint the database environment and remove inactive log files\&.
.PP
The
\fB\-\-db\-recover \fR\fB\fIdir\fR\fR
option runs a regular database recovery in the specified database directory\&. If that fails, it will retry with a (usually slower) catastrophic database recovery\&. If that fails, too, your database cannot be repaired and must be rebuilt from scratch\&. This is only supported when compiled with Berkeley DB support with transactions enabled\&. Trying recovery with QDBM or SQLite3 support will result in an error\&.
.PP
The
\fB\-\-db\-recover\-harder \fR\fB\fIdir\fR\fR
option runs a catastrophic database recovery in the specified database directory\&. If that fails, your database cannot be repaired and must be rebuilt from scratch\&. This is only supported when compiled with Berkeley DB support with transactions enabled\&. Trying recovery with QDBM or SQLite3 support will result in an error\&.
.PP
The
\fB\-\-db\-remove\-environment \fR\fB\fIdirectory\fR\fR
option has no short option equivalent\&. It runs recovery in the given directory and then removes the database environment\&. Use this
\fIbefore\fR
upgrading to a new Berkeley DB version if the new version to be installed requires a log file format update\&.
.PP
The
\fB\-\-db\-print\-leafpage\-count \fR\fB\fIfile\fR\fR
option prints the number of leaf pages in the database file
\fIfile\fR
as a decimal number, or UNKNOWN if the database does not support querying this figure\&.
.PP
The
\fB\-\-db\-print\-pagesize \fR\fB\fIfile\fR\fR
option prints the size of a database page in
\fIfile\fR
as a decimal number, or UNKNOWN for databases with variable page size or databases that do not allow a query of the database page size\&.
.PP
The
\fB\-\-db\-verify \fR\fB\fIfile\fR\fR
option requests that
bogoutil
verify the database file\&. It prints only errors, unless in verbose mode\&.
.SH "DATA FORMAT"
.PP
Bogoutil
reads and writes text files where each nonblank line consists of a word, any amount of horizontal whitespace, a numeric word count, more whitespace, and (optionally) a date in the form YYYYMMDD\&. Blank lines are skipped\&.
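.PP
As a rough illustration of this line format, the Python sketch below parses one line into its fields\&. It follows the description above literally (word, count, optional date) and does not attempt to cover any other fields a real dump may contain; the names
Entry
and
parse_line
are hypothetical, and the sketch assumes that words contain no whitespace\&.
.sp
.RS 4
.nf
```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    word: str
    count: int
    date: Optional[str]  # YYYYMMDD, or None when the line has no date


def parse_line(line: str) -> Optional[Entry]:
    """Parse one text line; return None for a blank line."""
    fields = line.split()  # splits on any run of whitespace
    if not fields:
        return None
    if len(fields) not in (2, 3):
        raise ValueError("expected: word count [date]")
    date = fields[2] if len(fields) == 3 else None
    if date is not None and not (len(date) == 8 and date.isdigit()):
        raise ValueError("date must be in the form YYYYMMDD")
    return Entry(fields[0], int(fields[1]), date)
```
.fi
.RE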
.SH "RETURN VALUES"
.PP
0 for successful operation\&. 1 for most errors\&. 3 for I/O or other errors\&. Error 3 usually means that something is seriously wrong with the database files\&.
.SH "AUTHOR"
.PP
Gyepi Sam
<gyepi@praxis\-sw\&.com>\&.
.PP
Matthias Andree
<matthias\&.andree@gmx\&.de>\&.
.PP
David Relson
<relson@osagesoftware\&.com>\&.
.PP
For updates, see
\m[blue]\fBthe bogofilter project page\fR\m[]\&\s-2\u[1]\d\s+2\&.
.SH "SEE ALSO"
.PP
bogofilter(1), bogolexer(1), bogotune(1), bogoupgrade(1)
.SH "NOTES"
.IP " 1." 4
the bogofilter project page
.RS 4
\%http://bogofilter.sourceforge.net/
.RE