phc-discussions - Re: [PHC] GPU multiplication speed?

Message-ID: <20140227181640.GA12810@openwall.com>
Date: Thu, 27 Feb 2014 22:16:40 +0400
From: Solar Designer <solar@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] GPU multiplication speed?

Bill,

On Thu, Feb 27, 2014 at 06:32:09PM +0400, Solar Designer wrote:
> On Thu, Feb 27, 2014 at 09:13:52AM -0500, Bill Cox wrote:
> > However, SHA-256, Blake2, and other hash
> > functions have a lot of parallelism, so a GPU can interleave
> > instructions that don't act on results from the prior few
> > instructions, reducing the latency impact.
> 
> As far as I'm aware, no current code takes advantage of this as it's
> incompatible with OpenCL and CUDA programming models, and as GPUs lack
> ability to issue more than one instruction per cycle from the same
> hardware thread (unlike CPUs).  Also, the parallelism available in those
> algorithms is insufficient to make it worthwhile to introduce any kind
> of data dependencies across threads.

Oh, I just realized you were referring to hiding the latencies, not
multi-issue.  You're correct, whatever (little) parallelism is available
within e.g. SHA-256 would help (a bit) if we were somehow too limited in
the number of instances of SHA-256 that we could run concurrently.
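
A minimal Python sketch of where that in-round parallelism sits, assuming
the standard FIPS 180-4 round structure.  The helper names (sha256_round,
sha256_single_block, icbrt, first_primes) are illustrative and are not
taken from any code discussed in this thread.  Python itself does not
overlap these operations; the point is only the dependency structure:
within one round the T1 chain reads e, f, g, h and the T2 chain reads
a, b, c, so neither waits on the other, while the 64 rounds stay serial.

```python
import hashlib
import math

MASK = 0xFFFFFFFF


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def icbrt(n):
    """Exact floor of the integer cube root, by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


# Round constants and initial state, derived exactly from the fractional
# parts of cube roots / square roots of the first primes.
PRIMES = first_primes(64)
K = [icbrt(p << 96) & MASK for p in PRIMES]
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]


def sha256_round(state, k, w):
    a, b, c, d, e, f, g, h = state
    # Chain 1: reads only e, f, g, h plus the round inputs.
    s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
    ch = (e & f) ^ (~e & g)
    t1 = (h + s1 + ch + k + w) & MASK
    # Chain 2: reads only a, b, c, so it does not depend on chain 1 and
    # could be interleaved with it to cover instruction latency.
    s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
    maj = (a & b) ^ (a & c) ^ (b & c)
    t2 = (s0 + maj) & MASK
    # The two chains meet only here; the next round needs this result.
    return ((t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g)


def sha256_single_block(msg):
    """SHA-256 of a message short enough to fit one 64-byte block."""
    if len(msg) > 55:
        raise ValueError("sketch handles single-block messages only")
    block = (msg + b"\x80" + b"\x00" * (55 - len(msg))
             + (8 * len(msg)).to_bytes(8, "big"))
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    for t in range(16, 64):
        x, y = w[t - 15], w[t - 2]
        sig0 = rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)
        sig1 = rotr(y, 17) ^ rotr(y, 19) ^ (y >> 10)
        w.append((w[t - 16] + sig0 + w[t - 7] + sig1) & MASK)
    state = tuple(H0)
    for t in range(64):  # serial: each round consumes the previous state
        state = sha256_round(state, K[t], w[t])
    digest = [(x + y) & MASK for x, y in zip(H0, state)]
    return b"".join(v.to_bytes(4, "big") for v in digest)


# Sanity check against the standard library implementation.
matches = sha256_single_block(b"abc") == hashlib.sha256(b"abc").digest()
```

The two chains amount to only a handful of operations per round, which
is the "(little)" above: with enough independent SHA-256 instances in
flight, the scheduler already has other work to cover the latency.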

Alexander