
MaiJ - Poo Technology Spamfilter


$Id: Poo,v 2.0.8 alpha 2004/04/19 12:03:01 smisch Exp $

Poo Technology is MaiJ's internal content negotiation system. It is used to sort incoming email into folders and to filter spam.

Project

The funny name derives from the MaiJ button "This is poo". The main goal of this project, however, was to develop a content negotiation system that can automatically determine the destination box of an email without the need to define email filters. The algorithm is based on Bayes' formula, which is widely used in machine learning and content processing.

Besides determining the destination folder from an email's content, the same approach is well suited to spam filtering.
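To illustrate how Bayes' formula can pick a destination box from content alone, here is a minimal sketch of a naive Bayes text classifier with add-one smoothing. It is not MaiJ's actual code: the class name NaiveBayes, the tokenizer, the folder labels "inbox" and "poo", and the training sentences are all assumptions made up for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Hypothetical multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of trained emails
        self.vocabulary = set()

    def train(self, text, label):
        """Record one training email under the given label (destination box)."""
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocabulary.update(tokens)

    def classify(self, text):
        """Return the label with the highest posterior log-probability."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior P(label)
            n_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                count = self.word_counts[label][token]
                # Smoothed log likelihood P(token | label)
                score += math.log((count + 1) / (n_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data: two emails per destination box.
brain = NaiveBayes()
brain.train("cheap pills buy now", "poo")
brain.train("win money now cheap offer", "poo")
brain.train("meeting agenda for monday", "inbox")
brain.train("lunch on monday with the project team", "inbox")

print(brain.classify("buy cheap pills"))
print(brain.classify("project meeting on monday"))
```

Because each label is simply a destination box, the same sketch covers both uses described above: with two labels it acts as a spam filter, with more labels it sorts mail into folders.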

Training

The algorithm has to be trained to work properly. Figure 1.1 plots the results of iterated training (0 to 100 trainees): in each iteration, the filter had to distinguish between "positive" and "negative" emails (400 per iteration).

After only about 60 trainees, the filter achieves a rate of over 90%.


figure 1.1
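An experiment of this kind can be sketched as follows. This is only an illustration under loud assumptions: the emails are synthetic word lists drawn from a made-up vocabulary, the test set of 400 emails is split evenly between the two classes, and the classifier is a simple naive Bayes with add-one smoothing. It does not reproduce the data or the numbers behind figure 1.1.

```python
import math
import random
from collections import Counter

LABELS = ("positive", "negative")
# Hypothetical toy vocabulary; real emails would be tokenized instead.
WORDS = {
    "positive": ["meeting", "report", "project", "lunch", "invoice", "schedule"],
    "negative": ["cheap", "pills", "winner", "casino", "offer", "free"],
}
SHARED = ["hello", "please", "today", "now", "your", "the"]


def make_email(rng, label):
    """Draw 8 words: about one third label-specific, the rest shared noise."""
    pool = WORDS[label] + SHARED + SHARED
    return [rng.choice(pool) for _ in range(8)]


def classify(counts, docs, vocab, words):
    """Naive Bayes decision with add-one smoothing; ties go to the first label."""
    total_docs = sum(docs.values())
    best_label, best_score = None, -math.inf
    for label in LABELS:
        score = math.log((docs[label] + 1) / (total_docs + len(LABELS)))
        n_words = sum(counts[label].values())
        for word in words:
            # The extra +1 in the denominator reserves mass for unseen words.
            score += math.log((counts[label][word] + 1) / (n_words + len(vocab) + 1))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def learning_curve(max_trainees=100, test_size=400, seed=0):
    """Return the hit rate on a fixed test set after 0..max_trainees trainees."""
    rng = random.Random(seed)
    test_set = [(make_email(rng, LABELS[i % 2]), LABELS[i % 2]) for i in range(test_size)]
    counts = {label: Counter() for label in LABELS}
    docs = Counter()
    vocab = set()
    curve = []
    for n in range(max_trainees + 1):
        # Measure first, so curve[n] is the rate after exactly n trainees.
        hits = sum(classify(counts, docs, vocab, words) == label for words, label in test_set)
        curve.append(hits / test_size)
        # Then train on one more email, alternating between the two classes.
        label = LABELS[n % 2]
        words = make_email(rng, label)
        counts[label].update(words)
        docs[label] += 1
        vocab.update(words)
    return curve


curve = learning_curve()
for n in (0, 10, 30, 60, 100):
    print(f"{n:3d} trainees: {curve[n]:.1%}")
```

With no trainees at all, every email gets the same score for both classes, so the untrained filter is right on exactly half of the balanced test set. How quickly the curve rises depends entirely on the data; real mail is far noisier than this toy vocabulary.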

Download

You can download my current Poo-Brain to use as a spam filter with MaiJ. If you're not from Germany or receive non-US spam, it may be better to train your MaiJ yourself.

My current poo-brain - 17K, 196 trainees, 99.6% rate

Initial brains are included in every MaiJ build.

Bookmarks

These bookmarks are external links, listed in alphabetical order. No warranty is given for their content.

http://www.cs.ccsu.edu/~markov/ccsu_courses/nbc.ps