What: This document describes how to use bogofilter to filter mail that passes through a qmail mail server. Theory: The idea is to setup bogofilter on the mail server and have it filter all incoming mail. There are several advantages to doing so: 1. Mail users on non-unix platforms will benefit from bogofilter spam filtering 2. bogofilter learns better since it has access to a larger corpus. There is also a mechanism for users to register new spam/nonspam messages, as well as correcting misclassifications. Assumptions: - Most of the steps described here require root privileges. - qmail is installed in /var/qmail. If you installed qmail from rpm, it is probably installed in /usr/qmail. - You have installed Bruce Guenter's qmail-qfilter program which is available from: http://untroubled.org/qmail-qfilter - bogofilter is installed in /usr/bin/bogofilter on the mail server. Installation: - Create a /home/bogofilter directory if you don't already have one. Note that you may change the name and location, if you wish. Just remember that you did so! # mkdir /home/bogofilter - Change the ownership of /home/bogofilter to the qmail alias user, usually called 'alias' # chown -R alias:alias /home/bogofilter - Build the initial spam and nonspam databases by feeding your corpus of mail. Assuming that the files are in mbox format in /home/bogofilter, you say: # bogofilter -d /home/bogofilter -s < spam.mbx # bogofilter -d /home/bogofilter -n < nonspam.mbx Filtering: - Create a script to invoke bogofilter, say /home/bogofilter/bogofilter-run, modeled on the following: #!/bin/sh exec /usr/bin/qmail-qfilter /usr/bin/bogofilter -d /home/bogofilter -p -u -e Make sure the script is executable! Given a good initial corpus, it is better to have bogofilter update its lists based on the message classification, since it is quite likely to get it right. NOTE: Misclassifications MUST be corrected later, else the corpus will degrade and misclassifications can get worse over time. - Setup an /etc/tcpcontrol/qmail.rules entry to trigger the filtering. Ideally, you want to filter incoming mail, but not outgoing. in /etc/tcpcontrol/qmail.rules or its equivalent, enter the default rule: :allow,QMAILQUEUE="/home/bogofilter/bogofilter-run" - Now, every incoming message will have the header line X-Bogosity: ... added to the headers. A bogofilter classified spam messages will have the entry: X-Bogosity: Spam ... Note that the actual header name is configurable at compile time and may have been changed. - Educate your users on how to filter their spam based on the value of the X-Bogosity header. Spam messages should be diverted to a spam mailbox, rather than deleted. Registration and Correction: Setup an email alias, for users to register new spam or nonspam messages as well as to correct misclassification. - copy contrib/dot-qmail-bogofilter-default to /var/qmail/alias/.qmail-bogofilter-default If you are not using the alias 'bogofilter', you'll need to rename the file to your alias and edit the contents If you are not using /home/bogofilter, you'll need to edit the contents - copy contrib/bogofilter-qfe script to /home/bogofilter. Change the domain variable in the script to your domain. Let's assume example.com. The script will have to be edited if: 1. You have multiple domains 2. Your users use Mail User Agents I don't know about. Please send me your improvements! If you wish to save copies of the incoming messages, run the command # /var/qmail/bin/maildirmake /home/bogofilter/Maildir and to /var/qmail/alias/.qmail-bogofilter-default, append the line: /home/bogofilter/Maildir/ The trailing slash is important. Thanks to John Crawford for reminding me. - Make sure the script is executable # chmod +x /home/bogofilter/bogofilter-qfe - Educate your users to bounce their messages to: To register: spam, send to bogofilter-register-spam@example.com nonspam, send to bogofilter-register-nonspam@example.com To correct misclassified: spam, send to bogofilter-spam@example.com nonspam, send to bogofilter-nonspam@example.com It is important that users bounce or redirect the messages rather than forward them since forwarding loses important header information. - Done! NOTE: If bogofilter generates "out of memory" error messages, check the ulimits setting. TODO: A message that is delivered to a group alias will be filtered once, but may be corrected multiple times. Ideally, the script should only accept the first correction and ignore the rest. Author: Gyepi Sam