lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 17 Apr 2015 03:11:32 +0100
From: Samuel Neves <>
Subject: Re: [PHC] winner selection

On 04/14/2015 01:32 PM, Thomas Pornin wrote:
> If we are talking about Makwa here, then I would like to make the
> (possibly bold) claim that Makwa, without considering delegation, is an
> acceptable replacement of bcrypt. More precisely, it has been reported
> that using GPU is not worth it when trying to brute-force a bcrypt
> hash; general-purpose CPU with a few kilobytes of RAM are a better
> bargain for that job. My claim is that the same holds for Makwa.

I think this might be too bold of an assertion. The main obstacles in implementing modular arithmetic in a GPU are the
smaller multipliers and the small amount of fast memory per thread. The latter has considerably improved since I worked
on GPU arithmetic, and since we only need squaring for Makwa, is simpler than generic exponentiation.

The smaller multipliers are of course a handicap versus general-purpose GPUs. Current GPUs are able to perform a
double-precision FMA at each cycle; this translates to something like 2-3 theoretical TFLOPS, which can get fairly
competitive with CPUs. This recent paper [1], which I admittedly cannot understand, claims big advantages versus a CPU;
a second paper [2] claims 7210 2048-bit modular exponentiations per second on an NVIDIA GTX 580, which may translate
roughly to 14+ million modular multiplications per second. I would expect that Makwa's squaring, without the need for
square-and-multiply or windowed exponentiation, can be made somewhat faster. [3] uses floating-point arithmetic instead
and claims speedups (110 million modular multiplications/s), but they don't go above 1024-bit modular arithmetic, making
performance hard to compare accurately.

So I would reluctantly suggest that Makwa, while certainly not blazingly fast in GPUs---not like MD5 or SHA-{1,2}---is
not directly comparable to bcrypt. Additionally, a stronger integer multiplier would make a big difference here, and
this seems like a more plausible inclusion in future GPUs than some mechanism that improves bcrypt performance


Powered by blists - more mailing lists