What:
This document describes how to use bogofilter to filter mail that passes
through a qmail mail server.
Theory:
The idea is to setup bogofilter on the mail server and have it filter all
incoming mail.
There are several advantages to doing so:
1. Mail users on non-unix platforms will benefit from bogofilter spam filtering
2. bogofilter learns better since it has access to a larger corpus.
There is also a mechanism for users to register new spam/nonspam messages, as
well as correcting misclassifications.
Assumptions:
- Most of the steps described here require root privileges.
- qmail is installed in /var/qmail. If you installed qmail from rpm, it is
probably installed in /usr/qmail.
- You have installed Bruce Guenter's qmail-qfilter program which is available
from:
http://untroubled.org/qmail-qfilter
- bogofilter is installed in /usr/bin/bogofilter on the mail server.
Installation:
- Create a /home/bogofilter directory if you don't already have one.
Note that you may change the name and location, if you wish.
Just remember that you did so!
# mkdir /home/bogofilter
- Change the ownership of /home/bogofilter to the qmail alias user, usually
called 'alias'
# chown -R alias:alias /home/bogofilter
- Build the initial spam and nonspam databases by feeding your corpus of mail.
Assuming that the files are in mbox format in /home/bogofilter, you say:
# bogofilter -d /home/bogofilter -s < spam.mbx
# bogofilter -d /home/bogofilter -n < nonspam.mbx
Filtering:
- Create a script to invoke bogofilter, say /home/bogofilter/bogofilter-run,
modeled on the following:
#!/bin/sh
exec /usr/bin/qmail-qfilter /usr/bin/bogofilter -d /home/bogofilter -p -u -e
Make sure the script is executable!
Given a good initial corpus, it is better to have bogofilter update its
lists based on the message classification, since it is quite likely to get
it right.
NOTE: Misclassifications MUST be corrected later, else the corpus will
degrade and misclassifications can get worse over time.
- Setup an /etc/tcpcontrol/qmail.rules entry to trigger the filtering.
Ideally, you want to filter incoming mail, but not outgoing.
in /etc/tcpcontrol/qmail.rules or its equivalent, enter the default rule:
:allow,QMAILQUEUE="/home/bogofilter/bogofilter-run"
- Now, every incoming message will have the header line
X-Bogosity: ...
added to the headers.
A bogofilter classified spam messages will have the entry:
X-Bogosity: Spam ...
Note that the actual header name is configurable at compile time and may
have been changed.
- Educate your users on how to filter their spam based on the value of the
X-Bogosity header.
Spam messages should be diverted to a spam mailbox, rather than deleted.
Registration and Correction:
Setup an email alias, for users to register new spam or nonspam messages
as well as to correct misclassification.
- copy contrib/dot-qmail-bogofilter-default to
/var/qmail/alias/.qmail-bogofilter-default
If you are not using the alias 'bogofilter', you'll need to rename the file
to your alias and edit the contents
If you are not using /home/bogofilter, you'll need to edit the contents
- copy contrib/bogofilter-qfe script to /home/bogofilter.
Change the domain variable in the script to your domain. Let's assume
example.com. The script will have to be edited if:
1. You have multiple domains
2. Your users use Mail User Agents I don't know about.
Please send me your improvements!
If you wish to save copies of the incoming messages, run the command
# /var/qmail/bin/maildirmake /home/bogofilter/Maildir
and to /var/qmail/alias/.qmail-bogofilter-default, append the line:
/home/bogofilter/Maildir/
The trailing slash is important. Thanks to John Crawford for reminding me.
- Make sure the script is executable
# chmod +x /home/bogofilter/bogofilter-qfe
- Educate your users to bounce their messages to:
To register:
spam, send to bogofilter-register-spam@example.com
nonspam, send to bogofilter-register-nonspam@example.com
To correct misclassified:
spam, send to bogofilter-spam@example.com
nonspam, send to bogofilter-nonspam@example.com
It is important that users bounce or redirect the messages rather than
forward them since forwarding loses important header information.
- Done!
NOTE:
If bogofilter generates "out of memory" error messages, check the ulimits
setting.
TODO:
A message that is delivered to a group alias will be filtered once, but may
be corrected multiple times. Ideally, the script should only accept the first
correction and ignore the rest.
Author:
Gyepi Sam <gyepi@praxis-sw.com>