University of Helsinki Department of Computer Science
 


Bogofilter, the spam filter

At a very conservative estimate, the Computer Science Department e-mail system filters out about 800,000 to 1,000,000 unwanted e-mail messages (spam) per week based on IP addresses. Spam thus currently makes up over 80% of all incoming mail.

An estimated 100,000 spam messages pass through the filters each week. This means that many users get tens, some even hundreds, of spam messages per week.

Previously, we dealt with the messages that passed through these filters using a rule-based filter, which recognizes phrases that typically occur in spam and typical mistakes in the subject field of spam messages. Unfortunately, this method has not been efficient or reliable enough.

Since spring 2003, the CS Dept. has been using a system called bogofilter, which is based on statistical analysis of e-mail messages. For each message, it computes an index describing the probability that the message is spam. Each message is given an extra header that looks something like this:

X-Bogosity: Yes, tests=bogofilter, spamicity=0.988761, version=0.12.3

The header name X-Bogosity is always followed by a verdict on whether the message is spam: "Yes", "No" or "Unsure". Then come the list of tests (bogofilter), the actual index value (spamicity) and the software version number.
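To illustrate the structure of the header, the following minimal Python sketch splits it into the verdict and the key=value fields that follow. The function name parse_bogosity and the sample message are made up for this example; only the header line itself is taken from the text above.

```python
from email import message_from_string


def parse_bogosity(raw_message):
    """Return (verdict, fields) from the X-Bogosity header, or None if absent.

    parse_bogosity is a hypothetical helper, not part of bogofilter.
    """
    msg = message_from_string(raw_message)
    header = msg.get("X-Bogosity")
    if header is None:
        return None
    # The verdict ("Yes", "No" or "Unsure") comes before the first comma.
    verdict, _, rest = header.partition(",")
    fields = {}
    for part in rest.split(","):
        key, sep, value = part.strip().partition("=")
        if sep:
            fields[key] = value
    return verdict.strip(), fields


# Sample message invented for the example; the header line is from the text.
raw = (
    "From: sender@example.com\n"
    "Subject: Hello\n"
    "X-Bogosity: Yes, tests=bogofilter, spamicity=0.988761, version=0.12.3\n"
    "\n"
    "Body text.\n"
)
verdict, fields = parse_bogosity(raw)
print(verdict, float(fields["spamicity"]), fields["version"])
```

Most users never need to parse the header themselves. A mail rule that looks only at the verdict is enough.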

In any mail reader, the best way to use the filter is to set a mail rule that moves all messages with the header "X-Bogosity: Yes" straight into a chosen folder, bypassing the inbox. The folder can be a Trash folder, which most mail readers create automatically and from which our mail system automatically deletes any messages that are over a week old. Another possibility is to create a separate Spam folder. In the department Webmail (SqWebMail), this is done on the Folders page by clicking on "Create new folder", and it is fairly simple to do in any other mail reader as well.

Redirecting spam with the Webmail user interface:

  1. Go to "Edit Mail Filters" from the link at the top.
  2. Fill the following information into the "Edit/Add mail filters" form:
    • Rule name: choose your own, e.g. "Spam"
    • Click on "Condition: Header" and write "X-Bogosity" in the field
    • Choose "starts with" from the drop-down menu and write "Yes" in the field
    • Click on "Action: Save in" and choose the folder you want from the drop-down menu; DO NOT tick the box "and continue filtering".
  3. Click on "Submit".
  4. Click on "Save all changes" at the top of the page.
  5. You can log out by clicking on "Log out" at the top of the page.

The redirection starts immediately, and all messages classified as spam will be redirected to the folder you chose. It is a good idea to check the folder every now and then, because bogofilter sometimes makes mistakes and legitimate e-mail may end up there.
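The logic of such a mail rule can be sketched in a few lines of Python. This is only an illustration of the "header starts with Yes" condition, not how SqWebMail implements its filters. The function name target_folder and the folder names "Spam" and "INBOX" are assumptions chosen for the example.

```python
from email import message_from_string


def target_folder(raw_message, spam_folder="Spam", inbox="INBOX"):
    """Pick a folder for a message based on its X-Bogosity header.

    Hypothetical helper: the folder names are assumptions, not fixed
    by the mail system.
    """
    msg = message_from_string(raw_message)
    header = msg.get("X-Bogosity", "")
    # Same condition as the Webmail rule: header value starts with "Yes".
    if header.strip().startswith("Yes"):
        return spam_folder
    # "No", "Unsure" and messages without the header stay in the inbox.
    return inbox


spam = "X-Bogosity: Yes, tests=bogofilter, spamicity=0.988761, version=0.12.3\n\nBuy now\n"
unsure = "X-Bogosity: Unsure, tests=bogofilter, spamicity=0.500000, version=0.12.3\n\nHi\n"
print(target_folder(spam), target_folder(unsure))
```

Note that messages marked "Unsure" are left in the inbox by this rule, which matches the Webmail filter described above.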

At a rough estimate, bogofilter catches 95% of the spam it sees, so some spam will still reach your inbox. Combined with the roughly 90% that IP filtering stops, we can assume that about 99.5% of all spam is averted.
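The combined figure follows from multiplying the fractions that each stage lets through: 10% of spam passes the IP filtering, and 5% of that passes bogofilter. The short Python sketch below reproduces the arithmetic, assuming the two stages act independently. The function name combined_block_rate is made up for this example.

```python
def combined_block_rate(*stage_rates):
    """Overall fraction blocked by filter stages applied one after another.

    Hypothetical helper; assumes each stage blocks its stated fraction
    of whatever reaches it, independently of the other stages.
    """
    passed = 1.0
    for rate in stage_rates:
        passed *= 1.0 - rate  # fraction that survives this stage
    return 1.0 - passed


ip_rate = 0.90    # IP-based filtering, rough estimate from the text
bogo_rate = 0.95  # bogofilter, rough estimate from the text
combined = combined_block_rate(ip_rate, bogo_rate)
print(f"{combined:.1%}")
```

With these estimates, roughly 0.5% of spam gets through both stages, i.e. about 5,000 messages out of every 1,000,000.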


postmaster@cs.helsinki.fi