Message-ID: <1065552343.3151.81.camel@tantor.nuclearelephant.com>
From: jonathan at nuclearelephant.com (Jonathan A. Zdziarski)
Subject: Spam with PGP

Actually, the way SA does it weakens filtering.  Bayesian filtering is
only a very small piece of SA, and unfortunately it has not received
much attention.  The Bayesian result contributes only a small percentage
of the final score.  Because a true Bayesian filter effectively performs
the vast majority of the tests that SA's rules perform, SA's own static
ruleset easily waters down the Bayesian findings whenever the two
disagree.  Take the Pine MUA as an example: SA thinks a Pine MUA
suggests an innocent message, but the majority of the emails with a Pine
MUA that my wife receives are spam.  In this case, the hard-coded MUA
rule will unfortunately water down the score, even if Bayes thinks a
Pine MUA indicates spam.  Obviously the Pine MUA rule is just one small
rule, but the same thing happens with the other rules.
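
To make the watering-down effect concrete, here is a minimal sketch of an
additive scorer in which the Bayesian result is only one capped input
among several hard-coded rules.  Every rule name, point value and the
threshold below is made up for illustration; none of them are SA's
actual scores.

```python
# Hypothetical numbers throughout; these are NOT SpamAssassin's real scores.
THRESHOLD = 5.0  # assumed cutoff: a total at or above this is tagged as spam


def bayes_points(p_spam):
    """Map a Bayesian spam probability onto a small, capped point value."""
    if p_spam >= 0.99:
        return 3.5
    if p_spam >= 0.80:
        return 2.0
    if p_spam <= 0.20:
        return -2.0
    return 0.0


def total_score(static_rule_points, p_spam):
    """Add the hard-coded rule points to the capped Bayesian contribution."""
    return sum(static_rule_points.values()) + bayes_points(p_spam)


static_rules = {
    "PINE_MUA": -1.5,       # hard-coded belief that Pine means "innocent"
    "HTML_ONLY": 1.0,
    "SUSPICIOUS_URL": 1.5,
}
p = 0.995  # the Bayesian filter is almost certain this message is spam

score = total_score(static_rules, p)
is_spam = score >= THRESHOLD

# Same message with the hard-coded Pine rule removed.
without_pine = {k: v for k, v in static_rules.items() if k != "PINE_MUA"}
score_without_pine = total_score(without_pine, p)

print(score, is_spam)                                      # 4.5 False
print(score_without_pine, score_without_pine >= THRESHOLD)  # 6.0 True
```

With these made-up values the static Pine rule alone is enough to pull a
message the Bayesian filter is nearly certain about back under the
threshold.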

What's worse is that the last time I looked (this may have changed),
SA's Bayesian filter did not appear to have a mechanism for learning; it
was just a static dictionary.  If users got spam, there was no way for
them to forward it into the system for processing.  Again, this may have
changed, and if it has, that's great.
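
By "a mechanism for learning" I mean something along the lines of the
sketch below: per-token counts that are updated every time a user feeds
a message back in, as opposed to a dictionary that never changes.  The
class and method names are invented for the example, the tokenizer is
deliberately naive, and the estimate is a simplified frequency ratio; it
is not SA's code.

```python
from collections import Counter


class TokenStats:
    """Hypothetical trainable token dictionary (illustration only)."""

    def __init__(self):
        self.spam = Counter()  # number of spam messages containing each token
        self.ham = Counter()   # number of ham messages containing each token
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, text, is_spam):
        """Update the counts from one message the user has classified."""
        tokens = set(text.lower().split())  # naive whitespace tokenizer
        if is_spam:
            self.spam.update(tokens)
            self.n_spam += 1
        else:
            self.ham.update(tokens)
            self.n_ham += 1

    def p_spam(self, token):
        """Simplified per-token spam probability from relative frequencies."""
        s = self.spam[token] / max(self.n_spam, 1)
        h = self.ham[token] / max(self.n_ham, 1)
        if s + h == 0:
            return 0.5  # token never seen: no evidence either way
        return s / (s + h)


stats = TokenStats()
stats.learn("x-mailer: pine lunch tomorrow", is_spam=False)
stats.learn("status report attached", is_spam=False)
before = stats.p_spam("pine")

# The user forwards three spams that all claim to come from Pine.
for body in ("x-mailer: pine cheap pills",
             "x-mailer: pine refinance now",
             "x-mailer: pine hot stocks"):
    stats.learn(body, is_spam=True)
after = stats.p_spam("pine")

print(before, round(after, 3))
```

After the forwarded spams are learned, the estimate for the "pine" token
moves from innocent toward spam, which is exactly what a static
dictionary cannot do.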

What Bayesian filtering produces already includes all the heuristic
tests, so having both _hurts_ you; it is not something you benefit
from.  It is much better to focus on creating a strong probability-based
filter IMHO...and I think the statistics agree with me.
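
As a rough illustration of why the heuristics come along for free: in a
probability-based filter, a feature such as the MUA is just another
token with a learned probability, and it is combined with the rest.  The
token names and probabilities below are made up, and the combination is
the usual naive-Bayes style product, shown as a sketch rather than any
particular filter's implementation.

```python
from math import prod


def combine(probs):
    """Naive-Bayes style combination of per-token spam probabilities.

    Assumes every probability is strictly between 0 and 1.
    """
    probs = list(probs)
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


# Hypothetical learned values; the MUA is treated like any other token.
learned = {
    "mua:pine": 0.90,      # learned from this user's mail, not hard-coded
    "word:pills": 0.99,
    "word:meeting": 0.20,
}

result = combine(learned.values())
print(round(result, 4))
```

Here the MUA evidence pushes in whichever direction the user's own mail
says it should, instead of being fixed in advance by a rule.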

> Of course, SpamAssassin does bayesian filtering as well.
> 
> heuristic + bayesian is better than either alone, IMHO.


