Date: Wed, 15 Jan 2014 23:08:41 -0500
From: Bill Cox <waywardgeek@...il.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Using multiply (Re: [PHC] A must read...)

Er... yes?  I am thinking about it, but I may be making mistakes.  I'm
running 2 threads on my dev machine, but I think issuing 4 SSE
instructions in parallel may be better, and I could do that by
interleaving the memory.  Until I get some experience with SSE, all I
can do is guess.  My guess is that I could double my memory bandwidth,
but I really don't know.

Bill

On Wed, Jan 15, 2014 at 10:02 PM, Steve Thomas <steve@...tu.com> wrote:
>> On January 15, 2014 at 11:45 AM Bill Cox <waywardgeek@...il.com> wrote:
>>
>> I like the idea of floating point, but I doubt it's worth the excess
>> trouble we'll run into. A 32x32 -> 32 integer multiply seems solid and
>> pervasive enough. Besides that, it's fast on our devices even
>> compared to a custom ASIC, and it's a great operation for mixing bits,
>> at least when one operand is odd.
>
> Have you thought about doing 4 (or more) multiplies in parallel:
> 4 multiplies with SSE4.1 (PMULLD _mm_mullo_epi32)
> 8 multiplies with AVX2 (VPMULLD _mm256_mullo_epi32)
> 16 multiplies with AVX-512 (VPMULLD _mm512_mullo_epi32)
>
> You can reorder the values in any order in SSE2 with PSHUFD
> (_mm_shuffle_epi32). Reordering the values in AVX2 and AVX-512 is
> trickier and may need multiple instructions.
>
> SSE4.1 has been on pretty much every Intel CPU since 2008. AVX2
> just came out last year with Haswell. I think integer AVX-512 will be on
> Skylake in 2015. They could delay integer operations until the next
> iteration in 2017 like they did with AVX/AVX2. AVX-512 should probably
> be considered since this competition will end when AVX-512 is
> estimated to be available or on the horizon.
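
[Editorial sketch] The SSE4.1 suggestion quoted above (four 32-bit multiplies with PMULLD, then a PSHUFD reorder) might look roughly like the following. The function name mul4_rotate and the choice of a one-lane rotation are illustrative assumptions, not something proposed in the thread; the target attribute assumes GCC or Clang on x86.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2: loads/stores, _mm_shuffle_epi32 (PSHUFD) */
#include <smmintrin.h>  /* SSE4.1: _mm_mullo_epi32 (PMULLD) */

/* Hypothetical helper: multiply four 32-bit lanes pairwise, keeping the
 * low 32 bits of each product, then rotate the lanes down by one.
 * The target attribute (GCC/Clang) enables SSE4.1 for this function only. */
__attribute__((target("sse4.1")))
void mul4_rotate(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* PMULLD: four independent 32x32 -> 32 multiplies in one instruction. */
    __m128i prod = _mm_mullo_epi32(va, vb);

    /* PSHUFD: out lane 0 = prod lane 1, lane 1 = lane 2,
     * lane 2 = lane 3, lane 3 = lane 0. */
    __m128i rot = _mm_shuffle_epi32(prod, _MM_SHUFFLE(0, 3, 2, 1));

    _mm_storeu_si128((__m128i *)out, rot);
}
```

Calling mul4_rotate with a = {1, 2, 3, 0xFFFFFFFF} and b = {5, 7, 9, 3} stores {14, 27, 0xFFFFFFFD, 5} in out. The AVX2 and AVX-512 variants would use the wider intrinsics named in the message, with lane reordering handled differently.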

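[Editorial sketch] The quoted remark that a 32x32 -> 32 multiply mixes bits well "when one op is odd" can be illustrated by the fact that multiplying by an odd constant modulo 2^32 is invertible, so no input information is lost, whereas an even multiplier discards high bits. The helper names and the example constant below are assumptions chosen for illustration only.

```c
#include <stdint.h>

/* Hypothetical helper: multiplicative inverse of an odd value modulo 2^32,
 * computed with Newton's iteration. */
uint32_t inverse_odd_u32(uint32_t a)
{
    uint32_t x = a;          /* a*a == 1 (mod 8) for odd a: 3 low bits correct */
    for (int i = 0; i < 4; i++)
        x *= 2u - a * x;     /* each step doubles the number of correct low bits */
    return x;
}

/* Hypothetical mixing step: 32x32 -> 32 multiply, wrapping modulo 2^32. */
uint32_t mix_mul(uint32_t v, uint32_t m)
{
    return v * m;
}
```

With an odd multiplier such as 0x9E3779B1, mix_mul(mix_mul(v, m), inverse_odd_u32(m)) returns v for every v. With an even multiplier, distinct inputs can collide: mix_mul(0x80000000, 2) and mix_mul(0, 2) are both 0.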