Spamassassin: BAYES_00=-1.90 although sa-learn runs daily

I recently noticed that my SpamAssassin Bayes filter did not seem to work, even though I run a script every day that uses sa-learn to learn ham and spam tokens. Looking at the headers of spam mail, I often saw something like:

X-Spam-Status: No, score=1.411 tagged_above=1 required=4.5
    	tests=[BAYES_00=-1.9, DKIM_ADSP_CUSTOM_MED=0.001,
    	FORGED_YAHOO_RCVD=1.63, FREEMAIL_FROM=0.001, NML_ADSP_CUSTOM_MED=0.9,
    	SPF_NEUTRAL=0.779] autolearn=no

As you can see, BAYES_00 fired, which means the Bayes filter rated this spam mail as almost certainly ham (a spam probability of 0 to 1%). It behaved as if it had never been trained on my mail. How can this happen although I have thousands of learned hams and spams in my database?

Well, for me there was an easy explanation…

My sa-learn script runs as a cron job in the root context and learns into the database located at /root/.spamassassin/. But SpamAssassin is called in the context of amavis, which uses /var/lib/amavis/.spamassassin/.

You can check this easily:

sudo sa-learn -D --dump magic
[...]
Okt  2 13:10:32.216 [17743] dbg: config: using "/root/.spamassassin" for user state dir
Okt  2 13:10:32.216 [17743] dbg: bayes: tie-ing to DB file R/O /root/.spamassassin/bayes_toks
Okt  2 13:10:32.216 [17743] dbg: bayes: tie-ing to DB file R/O /root/.spamassassin/bayes_seen
Okt  2 13:10:32.217 [17743] dbg: bayes: found bayes db version 3
[...]

vs.

su amavis -c "sa-learn -D --dump magic"
[...]
Okt  2 14:26:22.931 [20051] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x3870488) implements 'learner_is_scan_available', priority 0
Okt  2 14:26:22.931 [20051] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_toks
Okt  2 14:26:22.932 [20051] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_seen
Okt  2 14:26:22.932 [20051] dbg: bayes: found bayes db version 3
[...]
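
To compare the two contexts without scrolling through the whole debug log, a small filter can pull out just the database files. This is only a sketch: the function name bayes_db_files is made up for illustration, and it assumes the debug lines look like the ones shown above.

```shell
# Hypothetical helper: read sa-learn debug output on stdin and print the
# unique Bayes DB files named in "tie-ing to DB file" lines.
bayes_db_files() {
    sed -n 's|.*bayes: tie-ing to DB file R/[OW] \(.*\)$|\1|p' | sort -u
}

# Usage (debug output goes to stderr, hence 2>&1):
#   sudo sa-learn -D --dump magic 2>&1 | bayes_db_files
#   su amavis -c "sa-learn -D --dump magic" 2>&1 | bayes_db_files
```

If the two calls print different directories, learning and scanning use different databases.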

Okay, now that we know what is wrong, how do we fix it?

1. Export the learned data from the root context bayes database

sudo sa-learn --backup > /tmp/bayes-backup.txt
sudo chown amavis:amavis /tmp/bayes-backup.txt
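
Before importing, it may be worth checking that the backup really contains the learned messages. The sketch below assumes the backup is a tab-separated text file whose header lines look like "v<TAB>count<TAB>num_spam" and "v<TAB>count<TAB>num_nonspam". That layout is my assumption about the file, and the function name bayes_backup_counts is made up for illustration.

```shell
# Hypothetical helper: print the spam/ham message counts recorded in a
# sa-learn --backup file (assumed layout: v<TAB>count<TAB>name).
bayes_backup_counts() {
    awk -F'\t' '$1 == "v" && ($3 == "num_spam" || $3 == "num_nonspam") {
        print $3 "=" $2
    }' "$1"
}

# Usage:
#   bayes_backup_counts /tmp/bayes-backup.txt
```

If both counts are zero or missing, the root database was probably not the one your script learned into.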

2. Back up your amavis-related bayes database (just in case something goes wrong and you have to roll back)

su amavis -c "cp -R /var/lib/amavis/.spamassassin /var/lib/amavis/.spamassassin_bkp152001"

3. Import the learned data into the amavis context bayes database

su amavis -c "sa-learn --restore /tmp/bayes-backup.txt"
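
To confirm that the import arrived, you can look at the message counters in the amavis database. This is a sketch that assumes the usual "--dump magic" layout, where the count is the third column and the line ends in "non-token data: nspam" or "non-token data: nham". The function name bayes_counts is made up for illustration.

```shell
# Hypothetical helper: read "sa-learn --dump magic" output on stdin and
# print the number of learned spam and ham messages.
bayes_counts() {
    awk '/non-token data: (nspam|nham)$/ { print $NF "=" $3 }'
}

# Usage:
#   su amavis -c "sa-learn --dump magic" | bayes_counts
```

The numbers should now be in the range of what the root database held before.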

4. Make sure sa-learn uses amavis context from now on

For this, edit your /etc/spamassassin/local.cf and add the following line:

[...]
bayes_path /var/lib/amavis/.spamassassin/bayes
[...]

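Note that, although the option is called “path”, its value is a filename prefix (bayes), not a directory. SpamAssassin appends suffixes such as _toks and _seen to it, as you can see in the debug output.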
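
To make the prefix behaviour concrete, here is a small sketch that reads bayes_path from a config file and prints the two file names seen in the debug output. The function name bayes_files_from_cf is made up for illustration, and it only handles a plain "bayes_path value" line.

```shell
# Hypothetical helper: print the DB file names that result from the
# bayes_path prefix in the given config file (last setting wins).
bayes_files_from_cf() {
    awk '$1 == "bayes_path" { p = $2 }
         END { if (p != "") { print p "_toks"; print p "_seen" } }' "$1"
}

# Usage:
#   bayes_files_from_cf /etc/spamassassin/local.cf
```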

5. Check if sa-learn REALLY uses the amavis context database location

sudo sa-learn -D --dump magic
[...]
Okt  2 15:24:16.614 [21749] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x387c208) implements 'learner_is_scan_available', priority 0
Okt  2 15:24:16.615 [21749] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_toks
Okt  2 15:24:16.615 [21749] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_seen
Okt  2 15:24:16.616 [21749] dbg: bayes: found bayes db version 3
[...]

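If you see this, you were successful. All SpamAssassin Bayes activity now takes place in the amavis context. If your learning cron job keeps running as root, make sure the database files stay readable and writable for the amavis user.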
