Bayesian Filtering
TheMaskedMan
10/08/2005 11:21am
Hi,

I haven't been able to get the Bayesian filtering to work. I've had it set to both training mode and live mode, but neither seems to work. I've classified 258 messages as spam, but the same ones just keep popping up. Any advice would be appreciated.
fret
10/08/2005 12:14pm
Are there any files with the extension ".wdb" in your Scribe folder? (How big are they?)

Are you using the -o option to set the options file path?
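
If it helps to check the word list sizes without opening a file browser, here is a minimal Python sketch. It assumes only what is asked above: that the word lists are files ending in ".wdb" sitting in the Scribe folder. The function names and the 1024-byte cutoff are made up for illustration and are not part of Scribe.

```python
from pathlib import Path


def list_wdb_files(folder):
    """Return (name, size_in_bytes) for each .wdb file in folder, sorted by name."""
    return sorted(
        (p.name, p.stat().st_size)
        for p in Path(folder).glob("*.wdb")
        if p.is_file()
    )


def suspiciously_small(files, min_bytes=1024):
    """Names of word lists at or below min_bytes (hypothetical cutoff of about 1 KB)."""
    return [name for name, size in files if size <= min_bytes]


if __name__ == "__main__":
    import sys

    # Pass the Scribe folder as the first argument; defaults to the current folder.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    files = list_wdb_files(folder)
    for name, size in files:
        print(f"{name}: {size} bytes")
    for name in suspiciously_small(files):
        print(f"{name} looks nearly empty")
```

Run it with the path to your Scribe folder as the only argument.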

TheMaskedMan
10/08/2005 2:00pm
There is a spamwords.wdb and it is 1k. I'm not using any command-line options.
TheMaskedMan
11/08/2005 3:26pm
It never marks anything as spam, no matter if I send it to the spam folder. It tells me there is lots of ham, but no spam.
fret
11/08/2005 10:05pm
Run the "Filters->Rebuild Bayesian Word Lists" command and see if the wdb files are bigger than that. They should be larger than 1 kb.

Then once you've done that it should start to work if it's in live mode.

If you receive a spam after that, click "Filters->Analyse selected mail" and see if some of the words have a value near 1 (e.g. 0.8967). A value like that means it's a spammy word. If the overall value at the bottom is something like 0.7, then the Bayesian filter is working but the mail didn't go over the threshold needed to mark it as spam (which is 0.9).
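
To make the per-word values and the overall value a bit more concrete, here is a rough Python sketch of how per-word scores can be folded into one number and compared against the 0.9 threshold. It is an assumption, not Scribe's actual code: it uses the common naive Bayes combination prod(p) / (prod(p) + prod(1 - p)), and the names `combine` and `is_spam` are made up for illustration.

```python
import math

SPAM_THRESHOLD = 0.9  # the threshold mentioned above


def combine(word_values):
    """Combine per-word spam values (each between 0 and 1) into one overall score.

    Assumption: naive Bayes combination, prod(p) / (prod(p) + prod(1 - p)).
    Scribe's real formula may differ.
    """
    if not word_values:
        return 0.5  # no evidence either way
    spam = math.prod(word_values)
    ham = math.prod(1.0 - p for p in word_values)
    if spam + ham == 0.0:
        return 0.5  # contradictory certainties (a 0 and a 1) cancel out
    return spam / (spam + ham)


def is_spam(word_values, threshold=SPAM_THRESHOLD):
    """True only when the overall score goes over the threshold."""
    return combine(word_values) > threshold


if __name__ == "__main__":
    for values in ([0.8967, 0.95, 0.6], [0.7, 0.5]):
        print(values, round(combine(values), 4), is_spam(values))
```

Under this formula, a mail with several words near 1 lands over the threshold, while a mail whose words sit around 0.5 to 0.7 gets a middling overall value and is left alone.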
TheMaskedMan
12/08/2005 11:35am
Rebuilt and my filesizes are:

hamwords.wdb 447kb
spamwords.wdb 1kb
whitelist.wdg 21kb

I have 269 messages in my spam folder, none of which ever get analyzed.

Any ideas?
fret
16/08/2005 11:33pm
That's weird. Which version are you running?
TheMaskedMan
17/08/2005 11:52am
1.88 Test10
TheMaskedMan
25/08/2005 12:14pm
Same with Test11
fret
25/08/2005 12:57pm
I'll have to get you to run a debug build. I haven't set that up yet.
30Mil
01/09/2005 1:00pm
Just to add my observation:
I had let about 120 spams (that I moved myself) build up in the spam folder while Bayesian was set to learning.
After I got a decent amount built up I switched to live. I had to rebuild the word list to get any wdb files at all. But only after I read this post did I know to do that.
Hopefully it will start identifying spam now.
TheMaskedMan
09/09/2005 2:20pm
Hi fret, any word on a debug build? It still doesn't work for me. Thanks.
fret
12/09/2005 1:06am
Just sent a build to you.
TheMaskedMan
12/09/2005 12:13pm
To what address?
fret
12/09/2005 8:40pm
Your gmail address that you've used here on the forums... themaskedman [at ] gmail [dot] com
TheMaskedMan
12/09/2005 11:31pm
Oh, I see. It didn't make it through then. It's not in my Inbox or junk mail. You might try sending it to tmm@themaskedman.net. Thanks :)
fret
13/09/2005 1:42am
Ok I've sent something to that address as well.