full-disclosure - Re: [botnets] the world of botnets article and wrong numbers


Message-ID: <Pine.BSO.4.64.0609141708330.5350@funky.monkey.org>
Date: Thu, 14 Sep 2006 17:16:21 -0400 (EDT)
From: Jose Nazario <jose@...key.org>
To: "Dave \"No, not that one\" Korn" <davek_throwaway@...mail.com>
Cc: botnets@...testar.linuxbox.org, full-disclosure@...ts.grok.org.uk
Subject: Re: [botnets] the world of botnets article and wrong numbers

On Thu, 14 Sep 2006, Dave "No, not that one" Korn wrote:

> Can you go into detail about the methodology you're using here?  How do 
> you "get to a number" of 15,000 from a number "between 200 and 800"? 
> Is this a statistical extrapolation, or are you saying that your 
> honeynet gets 200 to 800 unique samples a month, and so does that one 
> over there, and that one, and that one.... and they all add up to 15000? 
> Do you attempt to correct for variants that are simply re-packed using a 
> different compressor, or other trivial changes?  Do you attempt to 
> correct for complex polymorphic variants?

my numbers are based on unique MD5 values.
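
a minimal sketch of what counting by unique MD5 means, in python. the byte strings are hypothetical stand-ins for captured binaries, and the function name is made up for illustration; it is not the actual collection tooling.

```python
import hashlib

def count_unique_md5(samples):
    """Return the number of distinct MD5 digests among an iterable of byte strings."""
    return len({hashlib.md5(s).hexdigest() for s in samples})

# hypothetical samples: the same bot body captured twice, plus one copy
# that differs only in its configured server name
base = b"BOT-BODY|server=irc.example.net"
reconfigured = b"BOT-BODY|server=irc.example.org"
samples = [base, base, reconfigured]

# exact duplicates collapse to one digest; the reconfigured copy counts as new
print(count_unique_md5(samples))
```

under this measure, any change to the bytes of a sample, however trivial, adds one to the count.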

the bulk of those are minor variants on a theme, i.e. repackaged bots or 
reconfigured bots, maybe a new module thrown in or something. only a small 
handful, maybe a dozen or so, are really new bots every month. very rarely 
do we see new bots or new capabilities added. the last major change was 
the use of the MS06-040 netapi exploit.

the bulk of the bot binaries i see are derivatives of well known families. 
very few new families emerge in any given timeframe, but in the HTTP bot
world, we're starting to see people develop tools and reuse them.

unique bot samples, ~12-15k or higher a month. many independent teams can 
back that ballpark figure up. new bot samples, truly new like i outlined 
above, are far fewer: about three orders of magnitude fewer.

by the way, in this day and age the bulk of people do not bother with 
polymorphism. they achieve it not through the classic - and elegant - 
methods of self modifying code but instead by churning out new bots fast 
and furious. same end result, though: confuse the naive, static detection 
tools out there.

> Some kind of explanation for the huge disjunction between these numbers 
> and our instinctive ideas about what's possible.  Of course, being 
> un-worked-out intuitive estimates, such ideas are of course entirely 
> likely to be off the mark, but off the mark by two orders of magnitude? 
> Hence the request for more methodological details.

i guess i'm curious about your position, then, and what you're meaning by 
"our instinctive ideas about what's possible".

it sounds like we're on the same page, but you may feel it's hyping the 
problem to talk about new bots based on unique MD5 values. it's not my 
favorite way of thinking about it, but it is easily underscored by a 
real-world fact: many AV vendors fail to detect the same bot source simply 
repackaged or re-configured (i.e. a new IRC server, everything else the 
same). hence, each new MD5 means a new detection hit for them. so, hype 
has a real-world backing, namely AV detection issues.
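
a toy illustration of that failure mode, assuming a detector that only matches exact MD5 digests against a known-bad set. real AV engines use other signature schemes as well; the names and byte strings below are hypothetical.

```python
import hashlib

def md5_hex(data):
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()

def detected(sample, known_hashes):
    """Naive static check: flag a sample only if its exact digest is already known."""
    return md5_hex(sample) in known_hashes

# hypothetical bot binary and a copy that differs only in the IRC server
original = b"BOT-BODY|server=irc.example.net"
variant = b"BOT-BODY|server=irc.example.org"

known = {md5_hex(original)}  # signature set built from the first sample only

print(detected(original, known))  # the known sample matches
print(detected(variant, known))   # the reconfigured copy has a different digest
```

the variant is the same bot in every way that matters, but an exact-match detector needs a fresh signature for it.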

________
jose nazario, ph.d.		    jose@...key.org
http://monkey.org/~jose/ 	    http://monkey.org/~jose/secnews.html
 				    http://www.wormblog.com/
