Message-ID: <20040112205826.CE2CA43224@maja.zesoi.fer.hr>
From: Bojan.Zdrnja at LSS.hr (Bojan Zdrnja)
Subject: spam with anti-bayesian parts
> -----Original Message-----
> From: full-disclosure-admin@...ts.netsys.com
> [mailto:full-disclosure-admin@...ts.netsys.com] On Behalf Of
> Suresh Ponnusami
> Sent: Tuesday, 13 January 2004 12:30 a.m.
> To: vogt@...senet.com; full-disclosure@...ts.netsys.com
> Subject: Re: [Full-Disclosure] spam with anti-bayesian parts
>
> Actually most of the spammers use automated tools that contains some
> scriptable plugins to evade the spam filters. Since they spam more that
> 1000's of users at a time, picking something real might be a bit slow and
> requires extra processing. Even if they create a template for all the mails,
> that'll take up some time which they may not want to waste on. Also,
> introducing random gibberish noise might be able to get through bayesian
> filters because, that particular gibberish junk may not be in
> the database.
That shouldn't help them if the spam marking software is written properly (and
much of it isn't). If a token is not in the Bayesian database, it is given
a neutral value (such as a 0.5 probability of being spam). However, their
marketing words (viagra or whatever) will get a high probability (more than
0.9). As the software should take only the 6 or so most spammy tokens into
account, all that gibberish won't matter.
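To make that concrete, here is a rough Python sketch. It is not taken from any particular filter. The 0.5 neutral value and the cutoff of 6 come from the paragraph above; the function name spam_score, the token probabilities, and the naive Bayes combining formula are my own assumptions for illustration.

```python
from math import prod

NEUTRAL = 0.5   # value given to tokens missing from the database
TOP_N = 6       # how many of the most spammy tokens are considered


def spam_score(tokens, database, top_n=TOP_N):
    """Combine the top_n most spammy token probabilities into one score.

    Assumes stored probabilities are clamped away from exactly 0.0 and 1.0,
    so the denominator below cannot become zero.
    """
    probs = [database.get(tok, NEUTRAL) for tok in set(tokens)]
    # Keep only the most spammy tokens; everything else is ignored.
    top = sorted(probs, reverse=True)[:top_n]
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    return spam / (spam + ham)


# Hypothetical database: the values are made up for illustration.
database = {"viagra": 0.99, "free": 0.92, "click": 0.91}

plain = ["viagra", "free", "click"]
# Same message padded with random gibberish that is not in the database.
padded = plain + ["xqzv", "blorf", "wuggle", "snarf", "plim", "zzkt"]

plain_score = spam_score(plain, database)
padded_score = spam_score(padded, database)
print(plain_score, padded_score)
```

Both messages end up with practically the same score: the unknown tokens either fall outside the top 6 or contribute a neutral 0.5, which cancels out in the combining formula.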
The one case where it will help them is when they include only a single IMG SRC
in an HTML e-mail, and that link wasn't in the database before. That way the
Bayesian classifier won't have a chance to know what's going on, as possibly
everything could get neutral values. In that case, other rules should help
(like RDBs and so on).
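A small sketch of that corner case, again in Python and again only an illustration: bayes_score, classify, the thresholds, and the sender_listed flag (a stand-in for whatever other rule is in use, such as a blocklist lookup) are all hypothetical names and values, not any real filter's API.

```python
from math import prod

NEUTRAL = 0.5  # value given to tokens missing from the database


def bayes_score(tokens, database, top_n=6):
    """Naive Bayes combination over the top_n most spammy tokens."""
    probs = sorted((database.get(t, NEUTRAL) for t in set(tokens)),
                   reverse=True)[:top_n]
    if not probs:
        return NEUTRAL
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


def classify(tokens, database, sender_listed):
    """Use the Bayesian score when it is decisive, else fall back."""
    score = bayes_score(tokens, database)
    if score >= 0.9:
        return "spam"
    if score <= 0.1:
        return "ham"
    # The classifier is undecided, so another rule has to make the call.
    return "spam" if sender_listed else "unsure"


# Hypothetical database and an image-only mail whose link was never seen.
database = {"viagra": 0.99}
image_only = ["img", "http://example.invalid/a7f3.gif"]

print(bayes_score(image_only, database))
print(classify(image_only, database, sender_listed=True))
print(classify(image_only, database, sender_listed=False))
```

With every token unknown, the Bayesian score stays neutral, so the verdict depends entirely on the fallback rule.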
For more info I'd suggest checking Paul Graham's wonderful Web page at:
http://www.paulgraham.com/antispam.html
Also, Jonathan's DSPAM has some nice documents on the Web (I'm pretty sure
he'll reply to this as well). You can find them at:
http://www.nuclearelephant.com/projects/dspam/
Cheers,
Bojan