Date: Wed, 19 Mar 2014 21:39:32 -0500 (CDT)
From: Steve Thomas <>
Subject: Re: [PHC] Supporting AVX2/SSE2 or not with a single binary

> On March 19, 2014 at 8:09 PM Andy Lutomirski <> wrote:
> On Wed, Mar 19, 2014 at 5:54 PM, Bill Cox <> wrote:
> > One reason I think we see applications running without SSE/AVX2
> > support is that operating systems don't want to support two versions
> > of a binary, and they have to support older machines. The Blake2 code
> > I've read does not provide for a single binary that supports both - I
> > have to link either to the blake2-ref code or the blake2-sse code. My
> > TwoCats code has inherited this limitation, since I used the Blake2
> > code as a roadmap for figuring out how SSE2 works. These guys over on
> > StackOverflow think they've got code to detect SIMD support and allow
> > a single binary to support both:
> >
> >
> That answer is crap. You cannot detect whether AVX is usable using
> just cpuid -- you need to use (IIRC) xgetbv as well.
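To illustrate the point about cpuid alone not being enough, here is a minimal sketch of an AVX/AVX2 usability check. It assumes GCC or Clang on x86 (the <cpuid.h> helpers and inline asm); MSVC would need its own intrinsics. The function names avx_usable(), avx2_usable() and read_xcr0() are made up for this example and are not taken from any code mentioned in this thread.

```cpp
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang helper header (assumption: not building with MSVC)

// Read extended control register 0 with XGETBV.
// Only legal once CPUID has reported OSXSAVE.
static inline std::uint64_t read_xcr0() {
    std::uint32_t lo, hi;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
#endif

// True only if the CPU has AVX *and* the OS saves/restores YMM state.
bool avx_usable() {
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return false;
    const bool osxsave = (ecx >> 27) & 1;  // OS has enabled XSAVE/XGETBV
    const bool avx     = (ecx >> 28) & 1;  // CPU implements AVX
    if (!osxsave || !avx) return false;
    // XCR0 bit 1 = SSE (XMM) state, bit 2 = AVX (YMM) state.
    return (read_xcr0() & 0x6) == 0x6;
#else
    return false;  // not an x86 build
#endif
}

// AVX2 additionally needs CPUID leaf 7, subleaf 0, EBX bit 5.
bool avx2_usable() {
#if defined(__x86_64__) || defined(__i386__)
    if (!avx_usable()) return false;
    if (__get_cpuid_max(0, nullptr) < 7) return false;
    unsigned eax, ebx, ecx, edx;
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return (ebx >> 5) & 1;
#else
    return false;
#endif
}
```

The order matters: XGETBV is only executed after CPUID reports OSXSAVE, since the instruction faults otherwise.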


> On gcc 4.8 and up, function multiversioning [1] is probably the way to
> go. On Windows (if you want to support MSVC), you'll need to do
> something different. Of course, function multiversioning has the same
> bug [2] and no one has fixed it yet.
> [1]
> [2]

I disagree. That's cool and all, but it's worthless: Linux-only and buggy.


Bill, don't try to learn SS*E*/AVX* from the blake2-ref code. Besides being
buggy, it is really hard to read. For me it was actually easier to read code
in a language I don't know than to read that. I was trying to figure out how
Blake2b worked, and with the combination of bad doc and bad code it was
"impossible" (not worth a few hours), until I found a simple implementation
in Go.

The way I've been dealing with write "once", compile twice (Linux/Windows),
run the best code "everywhere" is:
Specifically: src/common.cpp (getInstructionSets()), src/hashfactory.h, and
src\hash\*. Note there are better ways to do this. This was just one I did
a while ago, and I still look at it and think "well, I could change that and
it would be better."
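As a rough illustration of the general pattern (detect instruction sets once, then hand out the best implementation through a function pointer), here is a small sketch. It is not the code from the files named above; every name in it (detect_instruction_sets, byte_sum_generic, byte_sum_sse2, select_byte_sum, the ISA_* flags) is hypothetical, and it assumes GCC or Clang for <cpuid.h>. Only SSE2 is detected, because SSE2 needs nothing beyond a cpuid bit.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>      // GCC/Clang; MSVC would use __cpuid from <intrin.h>
#endif
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

enum : unsigned { ISA_NONE = 0, ISA_SSE2 = 1u << 0 };  // hypothetical flags

// Runtime detection: CPUID leaf 1, EDX bit 26 = SSE2.
unsigned detect_instruction_sets() {
    unsigned sets = ISA_NONE;
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && ((edx >> 26) & 1))
        sets |= ISA_SSE2;
#endif
    return sets;
}

// Portable fallback: sum of all bytes, modulo 2^32.
std::uint32_t byte_sum_generic(const std::uint8_t* p, std::size_t n) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += p[i];
    return sum;
}

// SSE2 variant. In a real project this would live in its own source file
// compiled with -msse2; here it falls back when SSE2 is not enabled.
std::uint32_t byte_sum_sse2(const std::uint8_t* p, std::size_t n) {
#if defined(__SSE2__)
    const __m128i zero = _mm_setzero_si128();
    __m128i acc = _mm_setzero_si128();
    std::size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p + i));
        // SAD against zero yields two 64-bit partial byte sums.
        acc = _mm_add_epi64(acc, _mm_sad_epu8(v, zero));
    }
    std::uint32_t sum =
        static_cast<std::uint32_t>(_mm_cvtsi128_si32(acc)) +
        static_cast<std::uint32_t>(_mm_cvtsi128_si32(_mm_srli_si128(acc, 8)));
    for (; i < n; ++i) sum += p[i];  // leftover tail bytes
    return sum;
#else
    return byte_sum_generic(p, n);
#endif
}

using ByteSumFn = std::uint32_t (*)(const std::uint8_t*, std::size_t);

// Factory: pick the best implementation for the detected feature mask.
ByteSumFn select_byte_sum(unsigned sets) {
    if (sets & ISA_SSE2) return byte_sum_sse2;
    return byte_sum_generic;
}
```

A caller would do `ByteSumFn f = select_byte_sum(detect_instruction_sets());` once at startup and call `f(data, len)` from then on, so the same binary runs on machines with and without the extension.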

Also, this is for a different purpose (hash cracking with rainbow tables), but
it can be used for hash cracking in general and for parallel hashing. These
implementations are all single-block and are severely length-limited. Oh
right, it has support for detecting AVX2, but no code is written for it,
although it would be super easy to add. (I just wanted to test it first, but
at the time I don't think I knew about Intel's emulator, and obviously AVX2
was not out yet; this was all written in late 2011 and early 2012.) One way to
make this better is to use templates and static inline class functions to
handle the minor differences between architectures.
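One possible reading of the templates-plus-static-inline idea, sketched under assumptions: each architecture gets a small "ops" class with static inline primitives, and the algorithm is written once as a template over that class. The names ScalarOps, Sse2Ops, mix_step, mix4_scalar and mix4_simd are invented for this example, and the mixing step is a toy, not part of any real hash.

```cpp
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// One 32-bit lane at a time.
struct ScalarOps {
    using Word = std::uint32_t;
    static inline Word load(const std::uint32_t* p) { return *p; }
    static inline void store(std::uint32_t* p, Word v) { *p = v; }
    static inline Word add(Word a, Word b) { return a + b; }
    static inline Word bxor(Word a, Word b) { return a ^ b; }
    template <int N> static inline Word rotl(Word v) {
        return (v << N) | (v >> (32 - N));
    }
};

#if defined(__SSE2__)
// Four 32-bit lanes at a time.
struct Sse2Ops {
    using Word = __m128i;
    static inline Word load(const std::uint32_t* p) {
        return _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
    }
    static inline void store(std::uint32_t* p, Word v) {
        _mm_storeu_si128(reinterpret_cast<__m128i*>(p), v);
    }
    static inline Word add(Word a, Word b) { return _mm_add_epi32(a, b); }
    static inline Word bxor(Word a, Word b) { return _mm_xor_si128(a, b); }
    template <int N> static inline Word rotl(Word v) {
        return _mm_or_si128(_mm_slli_epi32(v, N), _mm_srli_epi32(v, 32 - N));
    }
};
#endif

// The algorithm is written once; only Ops differs per architecture.
// Toy step: s = rotl7((s + m) ^ m).
template <class Ops>
inline void mix_step(std::uint32_t* state, const std::uint32_t* msg) {
    typename Ops::Word s = Ops::load(state);
    typename Ops::Word m = Ops::load(msg);
    s = Ops::add(s, m);
    s = Ops::template rotl<7>(Ops::bxor(s, m));
    Ops::store(state, s);
}

// Process four lanes with the scalar ops.
void mix4_scalar(std::uint32_t state[4], const std::uint32_t msg[4]) {
    for (int i = 0; i < 4; ++i) mix_step<ScalarOps>(state + i, msg + i);
}

// Process four lanes with SSE2 when compiled in, else fall back.
void mix4_simd(std::uint32_t state[4], const std::uint32_t msg[4]) {
#if defined(__SSE2__)
    mix_step<Sse2Ops>(state, msg);
#else
    mix4_scalar(state, msg);
#endif
}
```

Because the primitives are static inline members, the compiler can inline them fully, so the template costs nothing at run time compared with hand-written per-architecture copies.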
