bugtraq - Re: Is predictable spam filtering a vulnerability?


Message-ID: <40D505EA.40404@immunix.com>
Date: Sat, 19 Jun 2004 20:35:06 -0700
From: Crispin Cowan <crispin@...unix.com>
To: Andrew Hunter <andiroohunter@....com>
Cc: bugtraq@...urityfocus.com
Subject: Re: Is predictable spam filtering a vulnerability?


Andrew Hunter wrote:

> I think spam filters aren't the solution to the spam problem. If 
> someone gets 200 spam emails a day then what use is a spam filter 
> telling them the email was rejected? The user will end up not looking 
> at the list of rejected emails because it's so big.
>
> Filtering certain words is also bad, e.g. "penis", "viagra".
> It can easily be avoided:
> Email "Free penis enlargement pills" - Would be filtered
> Email "Free pen is enlargement pills" - Wouldn't be filtered

Sounds like you don't have much experience with spam filters.

I use the Bayesian spam filter built into Mozilla. On average, I get:

    * 95 legitimate mails per day
    * 180 additional spam per day
    * about 10 spam per day end up in the legitimate box, and the
      remaining 170 are filtered into the Junk folder, for periodic
      inspection
    * about 1 legitimate mail per month is mis-classified and put in the
      Junk folder
          o zero of them are critical, as the spam filter never Junks
            anything from anyone in my address book
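
For what it's worth, the rates implied by those figures can be worked out with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and the 30-day month is an assumption that is not in the numbers above.

```python
# Figures taken from the list above; the 30-day month is an assumption.
legit_per_day = 95
spam_per_day = 180
spam_missed_per_day = 10      # spam that ends up in the legitimate box
legit_junked_per_month = 1    # legitimate mail misfiled into Junk
days_per_month = 30           # assumed, for converting per-month to per-day

spam_caught_per_day = spam_per_day - spam_missed_per_day
catch_rate = spam_caught_per_day / spam_per_day
false_positive_rate = legit_junked_per_month / (legit_per_day * days_per_month)

print(f"spam caught per day: {spam_caught_per_day}")
print(f"share of spam caught: {catch_rate:.1%}")
print(f"share of legitimate mail misfiled: {false_positive_rate:.3%}")
```

On these assumptions the filter catches roughly 94% of the spam while misfiling a few hundredths of a percent of the legitimate mail.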

Note that spammers *do* use hacks like "Free pen is enlargement" (and a 
broad variety of other cute typos) and the Bayesian filter catches on to 
them very quickly.

> So in order to be effective it has to look for variations on the words.
> For example, for "penis" it could look for "P E N I S", "peni$" etc...
>
> This is when the problems start. I get sent 200 spam emails, the 
> rejected emails log is huge, I can't be bothered to look through it, 
> it'll take too long, but it has removed an important email.
>
> Email "Dear Andiroo, I have found your pen, it was under my desk. Your 
> PEN IS now in the top drawer of your desk".
>
> OK, I lost my special pen, my friend has found it, but look: "PEN IS" is 
> like "PENIS", so it's been taken by the spam filter.

The Bayesian filter is not fooled by these issues either way. I can say 
"penis" in an e-mail and it will not get filtered, because the scoring 
system balances the total score.
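
To make "balances the total score" concrete, here is a toy naive Bayes scorer. It is a minimal sketch and not Mozilla's actual implementation: the tokenizer, the add-one smoothing, the tiny training sets, and all function names are assumptions chosen for illustration. Each word contributes the log of how much more often it appears in spam than in legitimate mail, and the message is judged on the sum.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z$]+", text.lower()))

def train(messages):
    """Count how many messages each token appears in."""
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    return counts, len(messages)

def spam_log_odds(text, spam_model, ham_model):
    """Sum per-token log likelihood ratios; a positive total leans spam."""
    spam_counts, n_spam = spam_model
    ham_counts, n_ham = ham_model
    score = 0.0
    for token in tokenize(text):
        # Add-one smoothing so unseen tokens do not produce log(0).
        p_spam = (spam_counts[token] + 1) / (n_spam + 2)
        p_ham = (ham_counts[token] + 1) / (n_ham + 2)
        score += math.log(p_spam / p_ham)
    return score

# Hypothetical training mail, made up for this example.
spam_model = train([
    "Free penis enlargement pills",
    "Free pen is enlargement pills",
    "Cheap viagra free pills",
])
ham_model = train([
    "I have found your keys under the desk",
    "The report is in the top drawer",
    "My pen is out of ink",
])

split_word_spam = "Free pen is enlargement pills"
lost_pen_mail = ("Dear Andiroo, I have found your pen, it was under my desk. "
                 "Your PEN IS now in the top drawer of your desk")
plain_word_mail = "The doctor said the penis report is in the top drawer"

for mail in (split_word_spam, lost_pen_mail, plain_word_mail):
    print(f"{spam_log_odds(mail, spam_model, ham_model):+.2f}  {mail}")
```

With this toy data the split-word spam scores positive because "free", "enlargement" and "pills" outweigh the neutral "pen is", while the lost-pen mail and the mail that says "penis" outright both score negative because the rest of their words look like ordinary correspondence.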

> My solution for spam:
> I think there should be a huge database of spam emails, just like an 
> anti-virus scanner but for spam. I think it is that simple: have an 
> anti-virus but for spam. I am sure that if I get a spam email someone 
> else will have exactly the same email, so if I can submit it to the 
> database and it's added quickly so everyone can get the updates 
> then there would be no problem. But there is so much spam out there 
> that we would forever have to update our ever-growing databases.

That was tried a long time ago, and it failed. The problem is that the 
spammers caught on to it quickly, and started adding random junk to 
their mails just so that no two of them would have the same checksum. 
That is why you see random junk characters in the headers and bodies of 
spams.
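
A minimal sketch of why random junk defeats an exact-match database follows. It assumes the database stores a SHA-256 digest of each reported message; the real services may have used other checksums, and the helper names here are hypothetical.

```python
import hashlib
import random
import string

def checksum(message: str) -> str:
    """Digest of the full message text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

def add_random_junk(message: str, rng: random.Random, length: int = 8) -> str:
    """Append a line of random letters, as spammers do to vary each copy."""
    junk = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return f"{message}\n{junk}"

# Hypothetical shared database holding one reported spam.
original = "Free penis enlargement pills"
known_spam = {checksum(original)}

rng = random.Random(2004)  # seeded only so the example is repeatable
variants = [add_random_junk(original, rng) for _ in range(3)]

print("exact copy matched:", checksum(original) in known_spam)
for variant in variants:
    print("junk-padded copy matched:", checksum(variant) in known_spam)
```

The exact copy is found in the database, but every padded copy hashes to a different value, so none of them match even though a human would call them the same mail.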

> I think this would eliminate a lot of spam. I have run out of ideas for 
> preventing spam emails, so what other effective solutions are already 
> out there?

I think if the penalty for spamming was having your head mounted on a 
stick then there would be a lot less spam :)

Crispin

-- 
Crispin Cowan, Ph.D.  http://immunix.com/~crispin/
CTO, Immunix          http://immunix.com