Message-ID: <Pine.BSO.4.64.0609141708330.5350@funky.monkey.org>
Date: Thu, 14 Sep 2006 17:16:21 -0400 (EDT)
From: Jose Nazario <jose@...key.org>
To: "Dave \"No, not that one\" Korn" <davek_throwaway@...mail.com>
Cc: botnets@...testar.linuxbox.org, full-disclosure@...ts.grok.org.uk
Subject: Re: [botnets] the world of botnets article and wrong numbers
On Thu, 14 Sep 2006, Dave "No, not that one" Korn wrote:
> Can you go into detail about the methodology you're using here? How do
> you "get to a number" of 15,000 from a number "between 200 and 800"?
> Is this a statistical extrapolation, or are you saying that your
> honeynet gets 200 to 800 unique samples a month, and so does that one
> over there, and that one, and that one.... and they all add up to 15000?
> Do you attempt to correct for variants that are simply re-packed using a
> different compressor, or other trivial changes? Do you attempt to
> correct for complex polymorphic variants?
my numbers are based on unique MD5 values.
the bulk of those are minor variants on a theme, ie repackaged bots or
reconfigured bots, maybe a new module thrown in or something. only a small
handful, maybe a dozen or so, are really new bots every month. very rarely
do we see new bots or new capabilities added. the last major change was
the use of the MS06-040 netapi exploit.
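As a rough illustration of what counting by unique MD5 values means, here is a minimal sketch. The byte strings are made-up stand-ins for captured binaries, and the helper names (`md5_hex`, `count_unique_samples`) are hypothetical, not part of any tool mentioned in this thread.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of a sample's raw bytes."""
    return hashlib.md5(data).hexdigest()


def count_unique_samples(samples) -> int:
    """Count distinct samples, where 'distinct' means a distinct MD5 digest."""
    return len({md5_hex(s) for s in samples})


# Made-up stand-ins for captured binaries (assumption, not real data).
samples = [
    b"bot-core|irc.example.net",
    b"bot-core|irc.example.org",  # same code, reconfigured server
    b"bot-core|irc.example.net",  # exact duplicate of the first capture
]

print(count_unique_samples(samples))  # prints 2
```

Exact duplicates collapse into one entry, but any byte-level change (a repack, a new config) counts as a new sample under this metric.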
the bulk of the bot binaries i see are derivatives of well known families.
very few new families emerge in any given timeframe, but in the HTTP bot
world, we're starting to see people develop tools and reuse them.
unique bot samples, ~12-15k or higher a month. many independent teams can
back that ballpark figure up. new bot samples, truly new like i outlined
above, are far fewer. about three orders of magnitude fewer.
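The "three orders of magnitude" figure can be checked with quick arithmetic. This sketch takes "a dozen or so" as exactly 12 and uses the 12k and 15k endpoints quoted above; both choices are simplifying assumptions, and the function name is hypothetical.

```python
import math


def orders_of_magnitude(larger: float, smaller: float) -> float:
    """Base-10 logarithm of the ratio between two counts."""
    return math.log10(larger / smaller)


truly_new = 12  # "a dozen or so" taken as exactly 12 (assumption)

for unique_per_month in (12_000, 15_000):
    gap = orders_of_magnitude(unique_per_month, truly_new)
    print(unique_per_month, round(gap, 2))
```

Both endpoints land at roughly three orders of magnitude, which matches the ballpark stated above.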
by the way, in this day and age the bulk of people do not bother with
polymorphism. they achieve it not through the classic - and elegant -
methods of self modifying code but instead by churning out new bots fast
and furious. same end result, though: confuse the naive, static detection
tools out there.
> Some kind of explanation for the huge disjunction between these numbers
> and our instinctive ideas about what's possible. Of course, being
> un-worked-out intuitive estimates, such ideas are of course entirely
> likely to be off the mark, but off the mark by two orders of magnitude?
> Hence the request for more methodological details.
i guess i'm curious about your position, then, and what you mean by
"our instinctive ideas about what's possible".
it sounds like we're on the same page, but you may feel it's hyping the
problem to talk about new bots based on unique MD5 values. it's not my
favorite way of thinking about it, but it is easily underscored by a
real-world fact: many AV vendors fail to detect the same bot source simply
repackaged or re-configured (ie a new IRC server, everything else the
same). hence, each new MD5 means a new detection hit for them. so, hype
has a real-world backing, namely AV detection issues.
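A toy model of the detection problem described above: a signature set keyed on the exact hash of a known sample misses the same code with a different IRC server configured. This is only a sketch of naive static matching, not a claim about how any particular AV product works; the byte strings, server names, and function names are all made up for illustration.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of a sample's raw bytes."""
    return hashlib.md5(data).hexdigest()


def detected_by_hash(sample: bytes, signatures: set) -> bool:
    """Naive static detection: flag a sample only if its exact hash is known."""
    return md5_hex(sample) in signatures


# Hypothetical bot body plus its configured IRC server (made-up bytes).
original = b"BOTCODE" + b"server=irc.one.example"
reconfigured = b"BOTCODE" + b"server=irc.two.example"

# Signature set built from the original sample only.
known_bad = {md5_hex(original)}

print(detected_by_hash(original, known_bad))      # prints True
print(detected_by_hash(reconfigured, known_bad))  # prints False
```

Under this model every reconfigured or repackaged build needs its own signature, which is why a per-MD5 count tracks the detection workload even when the underlying bot is unchanged.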
________
jose nazario, ph.d. jose@...key.org
http://monkey.org/~jose/ http://monkey.org/~jose/secnews.html
http://www.wormblog.com/
_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/