Bug (?) in the Bayesian options + questions
06/11/2003 8:42am
Using Test44:
Bayesian Controls:
- Setting a folder name that does
not exist gives no warning.
- Tagging a mail as spam when the folder does
not exist marks the mail as read, but
nothing else seems to happen.

I have 4000+ spam-mails in my Mozilla
folders (which nicely imports as MBOX btw!)

How can I train Scribe on those emails?
In Training-mode - does it move suspected mails
or do I have to switch to live-mode? How soon
can I expect it to filter spam properly?
06/11/2003 9:07am
... also - where is the wordlist stored? Can I check if the training is working somehow?
06/11/2003 5:22pm
The word lists are stored as:
  • ham.wdb
  • spam.wdb
  • whitelist.wdb
    in the same directory as Scribe. They are just text files, so opening them in a text editor will show whether they are being populated correctly.
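A quick way to do the same check without opening each file is to list their sizes. This is only a minimal sketch and not part of Scribe: the function name `word_list_sizes` and the `scribe_dir` argument are made up for illustration, and only the three file names come from the reply above.

```python
from pathlib import Path

# File names as given in the reply above.
WORD_LISTS = ("ham.wdb", "spam.wdb", "whitelist.wdb")


def word_list_sizes(scribe_dir):
    """Return {file name: size in bytes}, with None for a missing file.

    scribe_dir is an assumption: the directory Scribe is installed in.
    """
    base = Path(scribe_dir)
    sizes = {}
    for name in WORD_LISTS:
        path = base / name
        sizes[name] = path.stat().st_size if path.is_file() else None
    return sizes
```

Calling `word_list_sizes(".")` from the Scribe directory returns one entry per word list. A size that stays the same after training suggests that list is not being written.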

    I'll add a warning when the probably folder is set to a nonexistent folder. Same for a missing /Spam folder.

    If your 4000 spam mails are in the /Spam folder, all you have to do is run the Filters->Rebuild Word Lists command and the filtering should start working straight away.
  • DiCeR
    07/11/2003 5:07am
    Thank you for your reply.

    Regarding the training of i.scribe (Win32/Test44)
    Following your instructions yields no visible result.

    HAMWORDS is populated by outgoing mails I send.
    WHITELIST grows as I add filters, and rebuild "regular" folders...

    ... but SPAMWORDS remains at 34 bytes, before and after training, rebuilding, and also when I individually tag SPAM.
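A before/after comparison like this can be made repeatable by recording the file sizes ahead of a rebuild and checking which ones did not change. The following is a minimal sketch under stated assumptions: `snapshot` and `unchanged` are hypothetical helper names, not Scribe functions, and the caller supplies the directory and the word list file names.

```python
from pathlib import Path


def snapshot(directory, names):
    """Map each file name to its size in bytes (0 if the file is missing)."""
    base = Path(directory)
    return {
        n: (base / n).stat().st_size if (base / n).is_file() else 0
        for n in names
    }


def unchanged(before, after):
    """Return the sorted names whose size is identical in both snapshots."""
    return sorted(n for n in before if after.get(n) == before[n])
```

Take one snapshot, run the rebuild or tag some spam, take a second snapshot, and pass both to `unchanged`. Any word list it returns did not grow or shrink in between.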

    I suspected it could have something to do with all my spam being imported. But checking an actual account where I receive quite a lot of spam daily, gave the same results.

    It seems that Bayesian functionality is broken in Test44.

    Apart from that - kickass program!!! :D I've been looking for something tiny and portable, and this seems to be it!

    Bravo! :)