Message-ID: <20140227181640.GA12810@openwall.com>
Date: Thu, 27 Feb 2014 22:16:40 +0400
From: Solar Designer <solar@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] GPU multiplication speed?
Bill,
On Thu, Feb 27, 2014 at 06:32:09PM +0400, Solar Designer wrote:
> On Thu, Feb 27, 2014 at 09:13:52AM -0500, Bill Cox wrote:
> > However, SHA-256, Blake2, and other hash
> > functions have a lot of parallelism, so a GPU can interleave
> > instructions that don't act on results from the prior few
> > instructions, reducing the latency impact.
>
> As far as I'm aware, no current code takes advantage of this as it's
> incompatible with OpenCL and CUDA programming models, and as GPUs lack
> ability to issue more than one instruction per cycle from the same
> hardware thread (unlike CPUs). Also, the parallelism available in those
> algorithms is insufficient to make it worthwhile to introduce any kind
> of data dependencies across threads.
Oh, I just realized you were referring to hiding the latencies, not
multi-issue. You're correct, whatever (little) parallelism is available
within e.g. SHA-256 would help (a bit) if we were somehow too limited in
the number of instances of SHA-256 that we could run concurrently.
Alexander
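
The distinction above (hiding latency with many concurrent instances versus with parallelism inside one instance) can be sketched with a toy model. This is only an illustration under stated assumptions, not a description of any real GPU: the function name `issue_utilization` and its parameters are hypothetical, the pipeline issues at most one instruction per cycle, every instruction has the same fixed latency, and each thread always has `ilp_per_thread` independent instructions ready.

```python
def issue_utilization(latency_cycles, threads, ilp_per_thread=1):
    """Fraction of issue slots kept busy in a toy single-issue pipeline.

    Assumptions (hypothetical, for illustration only):
    - a result becomes available latency_cycles after its instruction issues;
    - at most one instruction is issued per cycle in total;
    - each thread offers ilp_per_thread mutually independent instructions.
    """
    if latency_cycles < 1 or threads < 1 or ilp_per_thread < 1:
        raise ValueError("all parameters must be >= 1")
    # Each independent dependency chain can issue once per latency_cycles,
    # so the number of chains in flight bounds the achievable issue rate.
    chains_in_flight = threads * ilp_per_thread
    return min(1.0, chains_in_flight / latency_cycles)
```

In this model, with a latency of 8 cycles, 8 or more threads already fill every issue slot, so extra parallelism inside each hash instance gains nothing. With only 2 threads, raising `ilp_per_thread` from 1 to 2 doubles utilization from 0.25 to 0.5, which is the "too limited in the number of instances" case.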