ifile is a project, found via Sweetcode, that uses Naive Bayes to classify e-mail documents. This is the same technique that Paul Graham recently wrote about, and his write-up was discussed on Slashdot.
It seems to be a pretty useful method for e-mail classification. Some of the Slashdot posters preferred systems in which certain phrases or keywords are manually assigned scores and a message's aggregate score is used to classify it. I think the Naive Bayes method is likely to be more effective in practice because it requires less work from the user. All they have to do is supply the right classification for messages that were not automatically classified correctly, which is easier than manually isolating and scoring the phrase or pattern that identifies the spam.
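To make the "less work for the user" point concrete, here is a minimal sketch of a word-count Naive Bayes classifier with add-one smoothing. It is not ifile's actual code, nor the exact formula from Paul Graham's write-up. The class name `NaiveBayesSketch`, its `train` and `classify` methods, and the sample messages are all made up for illustration. The only thing the user ever supplies is a message plus the folder it belongs in.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSketch:
    """Hypothetical multinomial Naive Bayes over word counts (add-one smoothing)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of messages seen
        self.vocab = set()                       # every word seen in training

    def train(self, text, label):
        """Record one message under the label the user gave it."""
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        """Return the label with the highest log-probability, or None if untrained."""
        words = [w for w in text.lower().split() if w in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log P(label) + sum of log P(word | label)
            score = math.log(n_docs / total_docs)
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Illustrative training data, invented for this example.
clf = NaiveBayesSketch()
clf.train("cheap pills buy now", "spam")
clf.train("buy cheap watches now", "spam")
clf.train("meeting agenda for monday", "work")
clf.train("project status meeting notes", "work")
print(clf.classify("buy cheap pills"))
```

Correcting a mistake is just another call to `train` with the right label. Nobody has to decide which phrase gave the message away or how many points it should be worth, since the word counts take care of that.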
Posted by Alex at August 19, 2002 09:33 AM