			   GETTING STARTED

Summary:

  0. Terminology
  1. Installing Bogofilter
  2. Preparing for use
     a. Configuring bogofilter
     b. Training bogofilter
  3. Setting up the mail transfer and delivery agents
  4. Use with mail user agent
  5. Ongoing training
  6. Tuning bogofilter
  7. The bogoutil program
  8. Other useful commands
  9. Additional information

0. Terminology
--------------

    spam - unwanted email
    ham  - wanted mail (also called non-spam)
    false positive - a ham message that is wrongly scored as spam
    false negative - a spam message that is wrongly scored as ham

1. Installing Bogofilter
------------------------

    Bogofilter can be installed from source or from a binary package.
    Releases are made available on SourceForge.net.

    If you're a newbie, installing from a binary package is quickest
    and easiest.  If you're running an rpm-based distro like Fedora
    or openSUSE, install bogofilter from an rpm.  Similarly, if you're
    running Debian, Mint, or Ubuntu, install from a deb package.

    To install from source, download and untar a release, then build
    and install with the usual commands, i.e. "configure", "make", and
    "make install".  To ensure that the newly built bogofilter is
    running properly on your hardware and operating system, use
    "make check" to run a series of tests.
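
    The source build described above might be scripted roughly as
    follows.  This is only a sketch: the function name build_bogofilter
    is made up for illustration, and running the install step through
    sudo is an assumption about your system.  The argument is whatever
    directory the release was untarred into.

```shell
# Build, test, and install bogofilter from an unpacked source tree.
# build_bogofilter is a hypothetical helper name, not part of bogofilter.
build_bogofilter() {
    srcdir=${1:?usage: build_bogofilter SOURCE_DIR}
    (
        cd "$srcdir" || exit 1
        ./configure  || exit 1   # probe the system and generate Makefiles
        make         || exit 1   # compile
        make check   || exit 1   # run the bundled series of tests
        sudo make install        # assumption: installing needs root
    )
}
```

    Each step stops the build if it fails, so a failing "make check"
    prevents the install.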

    For source rpms, use "rpm -bb bogofilter.spec" and "rpm -ivh
    bogofilter" (or comparable commands).

    Binary formats include builds for dynamically linked (shared)
    libraries, e.g. bogofilter-VER.x86_64.rpm.

    See the INSTALL file for more info.

2. Preparing for use
--------------------

    Once bogofilter has been installed, it needs to be configured and
    trained, i.e. given messages that you classify as spam and ham.

    2a. Configuring bogofilter
    --------------------------

    Bogofilter's default configuration is conservative, i.e. only
    messages that score very high on the ham/spam scale are classified
    as spam.  This is done to minimize the number of false positives
    (non-spam messages which are classified as spam).

    If you need (or wish) to change bogofilter's configuration
    options, the file is named "bogofilter.cf".  Bogofilter first
    checks for /etc/bogofilter.cf and then for
    ~/.bogofilter/bogofilter.cf.  The configuration options are
    described in file bogofilter.cf.example.
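
    Before editing anything, it can help to see which of the two
    configuration files named above actually exist.  A minimal sketch
    (the helper name find_bogofilter_cf is hypothetical, and the paths
    are simply the two given in this section):

```shell
# Report which of the two configuration files named above exist.
# find_bogofilter_cf is a hypothetical helper name.
find_bogofilter_cf() {
    for cf in /etc/bogofilter.cf "$HOME/.bogofilter/bogofilter.cf"; do
        if [ -f "$cf" ]; then
            echo "found: $cf"
        else
            echo "missing: $cf"
        fi
    done
}
```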

    2b. Training bogofilter
    -----------------------

    Bogofilter uses a database for storing its tokens and their ham
    and spam counts.  The file is commonly called "the wordlist" and
    its standard location is ~/.bogofilter/wordlist.db.

    The simple rule when training bogofilter is "more is better".

    As distributed, bogofilter does not include a wordlist.  You, the
    user, need to tell bogofilter what you consider spam and what you
    consider ham.  This is bogofilter's training process and involves
    running bogofilter with appropriate flags and with messages you've
    determined are ham and spam.  As bogofilter can work with multiple
    mail formats, e.g. mailboxes, maildirs, MH directories, etc., the
    training commands will depend on your environment.

    As the default wordlist directory is $HOME/.bogofilter, the
    wordlist itself will be in $HOME/.bogofilter/wordlist.db.  For
    user john, this is /home/john/.bogofilter/wordlist.db.

    Some useful options for training include:

          -s - register message(s) as spam.
          -n - register message(s) as non-spam.
          -M - use mailbox mode, i.e. classify multiple messages in an
               mbox formatted file.
          -B file1, file2, ... - set bulk mode, i.e. process multiple
               messages (files or directories) named on the command
               line.
          -v - set the verbosity level.  With the -s and -n training
               options, this gives the number of messages read and
               words entered in wordlist.db.

    These options are documented in the bogofilter man page.

    Here are some sample commands:

       bogofilter -vn < ham.message.file
       bogofilter -vnM
       bogofilter -vnMB ham.maildir

       bogofilter -vs < spam.message.file
       bogofilter -vsM
       bogofilter -vsMB spam.maildir
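
    For a first training run from two mbox files, the options above
    can be combined as in the following sketch.  The function name
    train_from_mboxes and both file arguments are hypothetical; in
    mailbox mode each mbox is fed to bogofilter on standard input.

```shell
# Initial training from one ham mbox and one spam mbox.
# train_from_mboxes and both file arguments are hypothetical.
train_from_mboxes() {
    ham_mbox=${1:?usage: train_from_mboxes HAM_MBOX SPAM_MBOX}
    spam_mbox=${2:?usage: train_from_mboxes HAM_MBOX SPAM_MBOX}
    bogofilter -v -n -M < "$ham_mbox"    # register every message as non-spam
    bogofilter -v -s -M < "$spam_mbox"   # register every message as spam
}
```

    With -v, each run reports how many messages were read and how
    many words were entered in the wordlist.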

3. Setting up the mail transfer and delivery agents
---------------------------------------------------

    Bogofilter works with many mail transfer agents (such as postfix,
    sendmail, and qmail) and many mail delivery agents (for example
    procmail and maildrop).  Each of these has its own configuration
    file and methods for invoking spam filters.  Bogofilter's
    documentation includes files "integrating-with-postfix" and
    "integrating-with-qmail".  Read them for ideas on how to set up
    bogofilter for your environment.

    The most common setup uses bogofilter's "-p" (passthrough) option
    which adds an "X-Bogosity:" line at the end of the message's mail
    header.  Typical examples of this line are:

     (for spam)
       X-Bogosity: Spam, tests=bogofilter, spamicity=1.000000, version=0.92.8
       X-Bogosity: Spam, tests=bogofilter, spamicity=0.999765, version=0.92.8

     (for non-spam)
       X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=0.92.8
       X-Bogosity: Ham, tests=bogofilter, spamicity=0.000413, version=0.92.8
       X-Bogosity: Ham, tests=bogofilter, spamicity=0.373476, version=0.92.8

     (for "unsures")
       X-Bogosity: Unsure, tests=bogofilter, spamicity=0.500332, version=0.92.8
       X-Bogosity: Unsure, tests=bogofilter, spamicity=0.463498, version=0.92.8
       X-Bogosity: Unsure, tests=bogofilter, spamicity=0.640426, version=0.92.8
       X-Bogosity: Unsure, tests=bogofilter, spamicity=0.824933, version=0.92.8
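
    A delivery script can pick the classification word out of a
    header line like those above.  The sketch below assumes the line
    has exactly the layout shown in the examples; bogosity_of is a
    hypothetical helper name.

```shell
# Print the classification word (Spam, Ham, or Unsure) from the first
# X-Bogosity: header line of a message read on standard input.
# bogosity_of is a hypothetical helper name.
bogosity_of() {
    sed -n '
        /^$/q
        /^X-Bogosity: /{
            s/^X-Bogosity: \([A-Za-z]*\),.*/\1/p
            q
        }
    '
}
```

    It could be used as "bogofilter -p < message | bogosity_of".
    Reading stops at the first blank line, so an X-Bogosity line that
    appears in the message body is ignored.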

    Alternatively, bogofilter's return codes can be used by procmail
    (or maildrop) rules to put spam in one mailbox and ham in another.
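
    The same idea can be tried from a plain shell script before
    writing procmail or maildrop rules.  The sketch below assumes the
    exit codes listed in the bogofilter man page (0 for spam, 1 for
    non-spam, 2 for unsure, 3 for errors); the function name and the
    words it prints are made up for illustration.

```shell
# Print "spam", "ham", or "unsure" for one message file, based on
# bogofilter's exit status.
# Assumption: exit codes are 0 spam, 1 ham, 2 unsure, 3 error, as in
# the bogofilter man page.  classify_by_exit_code is a hypothetical name.
classify_by_exit_code() {
    msg=${1:?usage: classify_by_exit_code MESSAGE_FILE}
    rc=0
    bogofilter < "$msg" || rc=$?
    case $rc in
        0) echo spam ;;
        1) echo ham ;;
        2) echo unsure ;;
        *) echo "bogofilter failed on $msg" >&2; return 1 ;;
    esac
}
```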

4. Use with mail user agent
---------------------------

    Bogofilter is compatible with all mail user agents.  MUAs with
    filtering abilities can check the headers for "X-Bogosity: Spam"
    and "X-Bogosity: Ham" and take the appropriate actions for spam and
    ham.

    Alternatively, if your MUA has sufficient scripting capabilities,
    the MUA can run bogofilter and take the appropriate action.

    As time goes by and bogofilter encounters messages that it cannot
    classify with certainty, there will be messages classified as
    "Unsure".  As these messages are in the "gray" area, meaning "not
    clearly ham and not clearly spam", it's useful to have your MUA
    filter these messages to a separate folder (or mailbox) so you can
    use them to train bogofilter.

5. Ongoing training
-------------------

    Bogofilter can only do a good job if it has accurate and
    comprehensive information in its wordlist.

    As time goes by and bogofilter classifies messages for you, it
    will encounter problems because it does not have enough information
    to correctly classify each and every message.  It's important to
    check message classifications!

    "False negatives", i.e. spam classified as ham, are easy to find
    since they'll appear in your inbox and be noticed.  "False
    positives", i.e. ham classified as spam, are important to find
    because they're messages you want!  All messages in these groups
    should be used to train bogofilter.

    Filtering "Unsure" messages into a separate folder (or mailbox),
    and manually classifying and separating them into spam and ham,
    gives a good set of messages for training (using bogofilter's "-s"
    and "-n" flags).
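
    Once the "Unsure" messages have been sorted by hand, training on
    them might look like the sketch below.  It assumes the messages
    sit one per file in two directories of your choosing; the
    function name and both directory arguments are hypothetical.

```shell
# Train on "Unsure" messages after sorting them by hand.
# Assumption: the messages have been moved, one per file, into the two
# hypothetical directories given as arguments.
train_sorted_unsures() {
    spam_dir=${1:?usage: train_sorted_unsures SPAM_DIR HAM_DIR}
    ham_dir=${2:?usage: train_sorted_unsures SPAM_DIR HAM_DIR}
    bogofilter -v -s -B "$spam_dir"   # register as spam
    bogofilter -v -n -B "$ham_dir"    # register as non-spam
}
```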

    Bogofilter's FAQ has two entries that provide additional info:

       How do I start my bogofilter training?
       What are "training on error" and "training to exhaustion"?

    The FAQ can be found online in English and French at:

       http://bogofilter.sourceforge.net/bogofilter-faq.html
       http://bogofilter.sourceforge.net/bogofilter-faq-fr.html

6. Tuning bogofilter
--------------------

    Once you've used bogofilter for a while, you may wish to optimize
    its classification parameters.  The bogotune utility uses your
    wordlist and additional ham and spam messages to check a large
    variety of possible parameter values and find what'll work best
    for your environment.  For more info, read the bogotune man page
    and file bogofilter-tuning.HOWTO.html.
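
    A tuning run might be started as in the sketch below.  It assumes
    that -n and -s name the ham and spam message files, as in the
    bogotune man page synopsis; check the man page for your version
    before relying on it.  The function name and both file arguments
    are hypothetical.

```shell
# Run bogotune over additional ham and spam mboxes.
# Assumptions: -n and -s name the ham and spam files, as in the
# bogotune man page synopsis; tune_bogofilter and the file arguments
# are hypothetical.
tune_bogofilter() {
    ham_mbox=${1:?usage: tune_bogofilter HAM_MBOX SPAM_MBOX}
    spam_mbox=${2:?usage: tune_bogofilter HAM_MBOX SPAM_MBOX}
    bogotune -v -n "$ham_mbox" -s "$spam_mbox"
}
```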

7. The bogoutil program
-----------------------

    Bogoutil is a program that allows dumping the wordlist (as a text
    file), loading the wordlist (from a text file), displaying
    information about individual words, etc.

    Here are some sample uses of it:

    To display the wordlist contents:
       bogoutil -d ~/.bogofilter/wordlist.db

    To display the message counts for a word:
       bogoutil -w ~/.bogofilter .MSG_COUNT
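
    Dumping and loading can be combined to make a text backup of the
    wordlist and rebuild a database from it.  The sketch below
    assumes that -l loads a wordlist from standard input, as described
    in the bogoutil man page; the function name and the two output
    file names are hypothetical.

```shell
# Dump the wordlist to text and build a fresh database from the dump.
# Assumptions: -d dumps and -l loads, as in the bogoutil man page;
# backup_wordlist, wordlist.txt, and wordlist.new.db are hypothetical.
backup_wordlist() {
    db=${1:-$HOME/.bogofilter/wordlist.db}
    bogoutil -d "$db" > wordlist.txt             # text dump, one token per line
    bogoutil -l wordlist.new.db < wordlist.txt   # load the dump into a new file
}
```

    Both output files are written to the current directory, so the
    original wordlist is left untouched.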

8. Other useful commands
------------------------

    To test scoring of individual words:

       echo show these words | bogofilter -H -vvv
     or:
       bogoutil -p ~/.bogofilter show these words

    To see the tokens and their spamicity scores for a message:

       bogofilter -vvv < message

9. Additional information
-------------------------

    Bogofilter's distribution includes a number of files containing
    more information.  You'll find them in /usr/share/doc (or
    comparable location).  The following files are included:

    FAQs:

	English - bogofilter-faq.html
	French  - bogofilter-faq-fr.html

    General:

	INSTALL
	NEWS
	README
	RELEASE.NOTES

    Man pages:

	bogofilter
	bogolexer
	bogoutil
	bogotune
	bogoupgrade
	(also distributed in html and xml formats)

    HOWTOs:

	bogofilter-tuning.HOWTO.html
	integrating-with-postfix
	integrating-with-qmail

    Operating System specific README files:

	README.freebsd
	README.hp-ux
	README.RISC-OS