full-disclosure - weird worm ?

From: jonathan at nuclearelephant.com (Jonathan A. Zdziarski)
Subject: weird worm ?

> >For days now, I've been receiving weird messages, with a few lines of
> >apparently random, garbage text, like this:
> >
> >highest bailiff nomad father advise heir
> >oxygen honorarium allegro reveal wronskian indentation coachmen
> >deficient tribute arcturus mitigate bypath
> >
> >

Bayesian "noise" designed to circumvent spam filters.  Not very
effective against any good statistical-type filter, though.  The
spammer's philosophy is that if you can offset the guilty tokens in an
email with enough innocent tokens, you can get the spam through.  Since
most good Bayesian filters don't pay particular attention to unknown
words, this approach rarely works.  The spammer has to hit a series of
tokens that exist in the user's corpus as "very innocent" tokens in
order for the message to have a chance at passing inspection ("very
innocent" meaning as innocent as their spammy tokens are guilty).  It is
somewhat possible to target this for a small handful of users, but since
statistical filters are trained on each specific user's mail,
spammers are unable to get their message through to any large audience
behind such filters.  When they do get through, the innocent tokens that
were hit gradually lose their innocent status (heading more towards
neutral) and end up being ignored for more useful data in future scans.
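
To illustrate why unknown words don't help the spammer, here is a minimal
sketch of a Graham-style scorer.  The token table, the 0.5 "neutral" value
for unseen words and the 15-token cutoff are all assumptions made up for the
example, not the settings of any particular filter.

```python
from math import prod

# Hypothetical per-user table of P(spam | token); the values are invented.
TOKEN_SPAM_PROB = {
    "viagra": 0.99,
    "unsubscribe": 0.95,
    "click": 0.90,
    "meeting": 0.05,
    "patch": 0.03,
}

UNKNOWN_PROB = 0.5  # assumption: tokens never seen before are neutral
MAX_TOKENS = 15     # assumption: only the most decisive tokens are combined


def spam_score(words, table=TOKEN_SPAM_PROB):
    """Combine the most decisive token probabilities into one spam score."""
    tokens = sorted({w.lower() for w in words})
    probs = [table.get(t, UNKNOWN_PROB) for t in tokens]
    # Keep the tokens farthest from neutral; unknown words sit at distance 0.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    decisive = [p for p in probs[:MAX_TOKENS] if p != UNKNOWN_PROB]
    if not decisive:
        return UNKNOWN_PROB
    spam = prod(decisive)
    ham = prod(1.0 - p for p in decisive)
    return spam / (spam + ham)


spam_words = "click viagra unsubscribe".split()
noise = "highest bailiff nomad father advise heir wronskian".split()

print(spam_score(spam_words))                         # score without noise
print(spam_score(spam_words + noise))                 # same score with noise
print(spam_score(spam_words + ["meeting", "patch"]))  # lower, known-innocent hits
```

With this table, the dictionary noise leaves the score untouched because
none of those words are in the user's corpus.  Only words the user already
has recorded as innocent move it, and the spammer can't know which those are.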

On top of this, spammers that reuse would-be "innocent" tokens end up
with their messages inevitably sticking out like a sore thumb because of
these very tokens.  For example, how many people receive a lot of
innocent mail with the word "wronskian" in it?
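
The learning side can be sketched the same way.  This is a toy per-user
token counter, assuming a simple rate-based estimate clamped to 0.01..0.99;
the class name and the formula are illustrative assumptions, and real
filters differ in the details.

```python
from collections import Counter


class TokenStats:
    """Toy per-user token statistics (hypothetical, for illustration only)."""

    def __init__(self):
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.spam_msgs = 0
        self.ham_msgs = 0

    def train(self, words, is_spam):
        # Count each token once per message.
        tokens = {w.lower() for w in words}
        if is_spam:
            self.spam_counts.update(tokens)
            self.spam_msgs += 1
        else:
            self.ham_counts.update(tokens)
            self.ham_msgs += 1

    def spam_prob(self, token):
        s = self.spam_counts[token]
        h = self.ham_counts[token]
        if s + h == 0:
            return 0.5  # assumption: never-seen tokens are neutral
        spam_rate = s / max(self.spam_msgs, 1)
        ham_rate = h / max(self.ham_msgs, 1)
        p = spam_rate / (spam_rate + ham_rate)
        return min(0.99, max(0.01, p))


stats = TokenStats()
for _ in range(3):
    stats.train(["allegro", "meeting"], is_spam=False)
print(stats.spam_prob("allegro"))    # very innocent for this user

# One noise-laden spam gets through and is marked as spam.
stats.train(["allegro", "wronskian", "pills"], is_spam=True)
print(stats.spam_prob("allegro"))    # drifts toward neutral
print(stats.spam_prob("wronskian"))  # only ever seen in spam
```

In this sketch, a token that was innocent for the user moves toward neutral
once it shows up in trained spam, and a word the user never receives in
legitimate mail becomes a strong spam marker the first time it is trained.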

I get about 200 spams a day in my quarantine, and probably 1/4 of these
now attempt to use Bayesian noise.  It used to be jumbled characters,
though spammers quickly realized it was doing them more harm than good.
Now it's dictionary-type noise.

On top of this, by sending "weird" messages out (that is, messages that
have these tokens but not really any spam), users of these types of
filters are more likely to just delete the message in confusion, thereby
letting the spammer poison their wordlists with unique tokens the
spammer can use to send spam later.  Smart users will mark/forward
these messages as spam also.

The strange key markers you see such as RAND_IMG are a good giveaway
that these messages are from a spammer.  These tokens are supposed to be
interpolated by the spammer's software to generate unique messages for
each user, but I've seen a lot of messages where the script failed and
sent such tokens.
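
A quick way to spot these leftovers is a pattern match on the message body.
This is only a sketch: RAND_IMG is the one marker named here, and the
assumption that other markers follow the same UPPER_CASE shape is a guess,
as is the function name.

```python
import re

# Assumption: markers look like RAND_IMG, i.e. upper-case words joined by
# underscores.  Other spam tools may use a different placeholder syntax.
MARKER_RE = re.compile(r"\b[A-Z]{2,}(?:_[A-Z0-9]+)+\b")


def leftover_markers(body):
    """Return the distinct placeholder-looking tokens found in a message."""
    return sorted(set(MARKER_RE.findall(body)))


print(leftover_markers("oxygen honorarium RAND_IMG allegro RAND_IMG reveal"))
print(leftover_markers("highest bailiff nomad father advise heir"))
```

A pattern this loose will also match legitimate constants in technical mail,
so it is better treated as a hint than as a verdict.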

It's possible a spammer could be using a worm to generate and send out
these emails (especially if there's a significant rise in their
appearance on the Internet).  It's possible too that it could just be
another spam campaign.

Jonathan