Message-ID: <20031007205158.GA11636@pirispons.net>
From: fulldisclosure at pirispons.net (Kiko Piris)
Subject: Spam with PGP
On 07/10/2003 at 14:45, Jonathan A. Zdziarski wrote:
> Actually the way SA does it weakens filtering. SA's bayesian filtering
> is only a very small piece of SA, and unfortunately not much attention
> has been given to it. The filter's final calculation is only a small
> percentage of the actual final score. Because true Bayesian filtering
> performs a huge majority of the same tests that SA performs, SA's own
> ruleset easily waters down any bayesian findings whenever there are
> opposing values between the two.
IMHO, bayesian filters are no panacea right now; many of the spams I get
end like this:
---8<---
</body></html>ahdmf uvhuex qnzysthoa
r
xdgmeqxqyawg
--->8---
And these nonsense "words" fool bayesian filters. Spammers also do what
Brian Dinello pointed out.
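How much padding like that moves the score depends on how the filter treats tokens it has never seen. A rough sketch, assuming a Graham-style combiner (unseen tokens get 0.4, only the 15 most "interesting" tokens are combined); the per-token probabilities below are made up for illustration and do not come from any real filter's database:

```python
UNKNOWN = 0.4   # assumed default for tokens never seen in training
TOP_N = 15      # assumed number of "interesting" tokens to combine


def spam_probability(tokens, token_probs):
    """Combine per-token spam probabilities, naive-Bayes style."""
    probs = [token_probs.get(t, UNKNOWN) for t in tokens]
    # Keep the tokens whose probability is farthest from a neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    spam, ham = 1.0, 1.0
    for p in probs[:TOP_N]:
        spam *= p
        ham *= 1.0 - p
    return spam / (spam + ham)


# Hypothetical trained probabilities.
token_probs = {"viagra": 0.99, "free": 0.90, "click": 0.85}

message = ["viagra", "free", "click"]
padding = ["ahdmf", "uvhuex", "qnzysthoa", "xdgmeqxqyawg"]

plain = spam_probability(message, token_probs)
padded = spam_probability(message + padding, token_probs)
print(plain, padded)
```

Under these assumptions each unseen word nudges the result a little towards "not spam". A filter that handles unseen tokens differently will react differently.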
> For example, a pine MUA...SA thinks a pine MUA suggests an innocent
> message, but a majority of the emails with a pine MUA my wife receives
> are spams. In this case, the hard-coded MUA rule will unfortunately
> water down the score, even if Bayes thinks a pine MUA is spam.
> Obviously the pine MUA is just a small rule, but if you apply this to
> the other rules, you get the same results.
Rules can be easily deactivated or "reinforced" in
/etc/spamassassin/local.cf or ~/.spamassassin/user_prefs if the defaults
do not suit your needs.
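A minimal sketch of what that looks like, using SA's `score` setting. EXAMPLE_RULE_A and EXAMPLE_RULE_B are placeholders, not real rule names; substitute the names of the rules that actually fire on your mail:

```
# /etc/spamassassin/local.cf or ~/.spamassassin/user_prefs

# A score of 0 disables the rule.
score EXAMPLE_RULE_A 0

# A higher score "reinforces" the rule.
score EXAMPLE_RULE_B 3.5
```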
For example, right now, my SA (2.60-1 / debian sid) assigns no points to
mails having pine headers (if it did assign any points to them, it would
be very easy to configure it not to do so).
> What's worse is that last time I looked (this may have changed), SA's
> bayesian filter did not appear to have a mechanism for learning, but was
> just a static dictionary. If users got spam there was no way for the
> user to forward their spams into the system for processing. Again, this
> may have changed and if it has, that's great.
It has one (sa-learn). And with mutt and its macros, teaching SA from
its own errors is just a matter of a keypress.
> The product of Bayesian filtering includes all the heuristic tests as
> well, so having both _hurts_ you, and is not something you benefit
> from. It is much better to focus on creating a strong probability-based
> filter IMHO...and I think the statistics agree with me.
>
> > Of course, SpamAssassin does bayesian filtering as well.
> > heuristic + bayesian is better than either alone, IMHO.
I agree with this, rulebased+bayesian (SA) works better (at
least for me) than bayesian alone (bogofilter). However I must say that
bogofilter is the only bayesian filter I tried (and I uninstalled it
some months ago when I switched to SA).
As I said before, I think that bayesian filters are not perfect
(spammers use tricks to circumvent them). And I also think that
rulebased ones aren't perfect either (there are also tricks to fool
them, like the PGP one pointed out by whoever started this thread).
So I think that a combination of both is better.
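To make the combination concrete: SA adds up the scores of every rule that fires, the bayesian result being one of them, and compares the total to a threshold (5.0 by default). A rough sketch of that arithmetic; the rule names and scores here are invented for illustration and are not SA's real ones:

```python
REQUIRED_HITS = 5.0  # SA's default threshold


def classify(hits, scores):
    """Sum the scores of the rules that fired and compare to the threshold."""
    total = sum(scores.get(name, 0.0) for name in hits)
    return total, total >= REQUIRED_HITS


# Hypothetical rules and scores.
scores = {
    "BAYES_SAYS_SPAM": 4.0,      # bayesian verdict, expressed as a rule
    "HTML_ONLY_BODY": 1.5,       # heuristic rule that agrees with it
    "MUA_LOOKS_INNOCENT": -1.0,  # heuristic rule that disagrees
}
hits = list(scores)

default_total, default_is_spam = classify(hits, scores)

# Same message after zeroing the rule that does not suit this mailbox.
tuned = dict(scores, MUA_LOOKS_INNOCENT=0.0)
tuned_total, tuned_is_spam = classify(hits, tuned)

print(default_total, default_is_spam)
print(tuned_total, tuned_is_spam)
```

With these made-up numbers the disagreeing rule keeps the message just under the threshold, and zeroing it lets the bayesian and heuristic hits carry it over.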
Just my 2 cents...
Greetings to all
--
Kiko