Message-ID: <f26cd0911001050708m4832c0f9v8f5f13a5a57a5f15@mail.gmail.com>
Date: Tue, 5 Jan 2010 16:08:44 +0100
From: Dan Kaminsky <dan@...para.com>
To: T Biehn <tbiehn@...il.com>
Cc: Full Disclosure <full-disclosure@...ts.grok.org.uk>,
	bugtraq@...urityfocus.com, Joxean Koret <joxeankoret@...oo.es>
Subject: Re: [Tool] DeepToad 1.1.0

Joxean's stuff is similar to Nilsimsa or (as he mentions) ssdeep, in that
it'll find mostly similar instances of the same underlying data, assuming
only small bit-level changes (such as from version shifts).  It's obviously
not a magic unpacker of any arbitrary virus, though.

His stuff, by its very nature, is a fuzzy similarity metric, meaning if you
run it on small chunks of a file sequentially you can get a fuzzy diff.
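
A minimal sketch of that chunk-by-chunk idea, assuming Python's standard
difflib.SequenceMatcher as a stand-in similarity metric (this is not
DeepToad's or ssdeep's hashing; the function name and chunk size are
hypothetical):

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def fuzzy_diff(old: bytes, new: bytes, chunk_size: int = 64) -> List[Tuple[int, float]]:
    """Return (offset, similarity) for each pair of aligned chunks.

    Similarity is 1.0 for identical chunks and drops toward 0.0 as the
    chunks diverge. SequenceMatcher is only a placeholder metric here.
    """
    results = []
    for offset in range(0, max(len(old), len(new)), chunk_size):
        a = old[offset:offset + chunk_size]
        b = new[offset:offset + chunk_size]
        # autojunk=False keeps difflib from ignoring very frequent bytes.
        ratio = SequenceMatcher(None, a, b, autojunk=False).ratio()
        results.append((offset, ratio))
    return results


if __name__ == "__main__":
    old = bytes(range(256))
    new = old[:64] + b"\x00" * 64 + old[128:]
    for offset, ratio in fuzzy_diff(old, new):
        print(f"offset {offset:4d}: similarity {ratio:.2f}")
```

Chunks here are compared at fixed, aligned offsets, so an insertion that
shifts everything after it will make every later chunk look different; a
real tool would need some form of re-alignment.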

Detecting multiple files of the same file type is actually a different
problem, and sort of an interesting one.  The best thing to do here is take
a large number of samples that *are* your file type, and then a large number
of samples that *are not* your file type (and that don't all share one other
type), and look for either strings or statistical patterns
that show up in the member set and not in the alternate.  These fingerprints
are then sought in other samples.
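
The sampling approach above could be sketched roughly as follows. This is
an illustration under stated assumptions, not an established method: it
treats a "fingerprint" as a byte n-gram present in every member sample and
in no non-member sample, and all function names are hypothetical.

```python
from typing import Iterable, List, Set


def ngrams(data: bytes, n: int) -> Set[bytes]:
    """All distinct byte substrings of length n in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def learn_fingerprints(members: List[bytes], non_members: List[bytes], n: int = 4) -> Set[bytes]:
    """N-grams found in every member sample and in no non-member sample.

    Assumes at least one member sample is supplied.
    """
    common = set.intersection(*(ngrams(s, n) for s in members))
    seen_elsewhere = set().union(*(ngrams(s, n) for s in non_members))
    return common - seen_elsewhere


def looks_like_member(sample: bytes, fingerprints: Iterable[bytes]) -> bool:
    """True if any learned fingerprint occurs anywhere in the sample."""
    return any(fp in sample for fp in fingerprints)


if __name__ == "__main__":
    members = [b"%PDF-1.4 aaa", b"%PDF-1.7 bbbb"]
    non_members = [b"GIF89a xyz", b"PK\x03\x04 data"]
    fps = learn_fingerprints(members, non_members)
    print(sorted(fps))
    print(looks_like_member(b"%PDF-1.5 zzz", fps))
```

Requiring a pattern in every member sample is deliberately strict; with
real, noisy samples a frequency threshold would probably work better.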

It's not terribly common that you actually need to do this though.  Browsers
need to do this a bit because MIME types are wonky.  They do this
optimization by hand though.
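
For a sense of what a hand-maintained check looks like, here is a small
sketch that maps a few well-known leading "magic" byte sequences to MIME
types. The table and function name are illustrative assumptions, not any
particular browser's sniffing code.

```python
# A few widely documented file signatures; real sniffers carry far more
# entries and extra rules.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
]


def sniff_type(data: bytes, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the leading bytes, falling back to default."""
    for magic, mime in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return mime
    return default


if __name__ == "__main__":
    print(sniff_type(b"GIF89a" + b"\x00" * 10))
    print(sniff_type(b"plain text"))
```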


On Tue, Jan 5, 2010 at 3:56 PM, T Biehn <tbiehn@...il.com> wrote:

> I can see what you're saying, it could be useful for finding
> differences in different versions of the same binary but from what I
> can see Joxean's app is meant to group files of the same 'type,' not
> provide 'diff' capabilities.
>
> -Travis
>
> On Tue, Jan 5, 2010 at 9:51 AM, Dan Kaminsky <dan@...para.com> wrote:
> > I looked into a fair amount of this sort of normalization back when I was
> > playing with dotplots.  The idea was to upgrade from simple Levenshtein
> > string comparison (with no knowledge of variable length x86 instructions,
> > pointers that shift from compile to compile, etc) to something with at
> > least some domain specific knowledge.  What I found, somewhat surprisingly,
> > was that dumb string comparison was more than enough.  In fact, when I
> > compared pre-patch and post-patch builds, it was easy to directly see when
> > content was added, removed, shifted in location, etc.  Joxean's going to
> > have much the same result -- as basic as his similarity metric is, he'll
> > get the broad strokes just fine.
> >
> > Ultimately the best approach is to build a graph of how functions
> > interact and measure graph isomorphism, but of course Halvar figured that
> > out years ago :)
> >
> > On Tue, Jan 5, 2010 at 3:41 PM, T Biehn <tbiehn@...il.com> wrote:
> >>
> >> Hmm,
> >> Wouldn't it be more useful to the sec community to have an algorithm
> >> that abstracts at the -interpreted- content level? That is when
> >> analyzing binaries I wouldn't think that this would classify two with
> >> near identical functionality together, even though it is removing a
> >> significant chunk of information during the hash pass.
> >>
> >> I would largely assume that your algorithm, as is, works best on
> >> uncompressed bitmaps. Is there something I'm missing?
> >>
> >> -Travis
> >>
> >> On Sun, Jan 3, 2010 at 6:37 AM, Joxean Koret <joxeankoret@...oo.es> wrote:
> >> > Hi all,
> >> >
> >> > I'm happy to announce the very first public release of the open source
> >> > project DeepToad, a tool for computing fuzzy hashes from files.
> >> >
> >> > DeepToad can generate signatures, cluster files and/or directories,
> >> > and compare them. It's inspired by the very good tool ssdeep [1] and,
> >> > in fact, both projects are very similar.
> >> >
> >> > The complete project is written in pure Python and is distributed
> >> > under the LGPL license [2].
> >> >
> >> > Links:
> >> > Project's Web Page http://code.google.com/p/deeptoad/
> >> > Download Web Page http://code.google.com/p/deeptoad/downloads/list
> >> > Wiki http://code.google.com/p/deeptoad/w/list
> >> >
> >> > References:
> >> > [1] http://ssdeep.sourceforge.net/
> >> > [2] http://www.gnu.org/licenses/lgpl.html
> >> >
> >> > Regards && Happy new year!
> >> > Joxean Koret
> >> >
> >> >
> >> > _______________________________________________
> >> > Full-Disclosure - We believe in it.
> >> > Charter: http://lists.grok.org.uk/full-disclosure-charter.html
> >> > Hosted and sponsored by Secunia - http://secunia.com/
> >> >
> >>
> >>
> >>
> >> --
> >> FD1D E574 6CAB 2FAF 2921  F22E B8B7 9D0D 99FF A73C
> >> http://pgp.mit.edu:11371/pks/lookup?search=tbiehn&op=index&fingerprint=on
> >> http://pastebin.com/f6fd606da
> >>
