Date: Tue, 9 Sep 2014 06:18:24 +0400
From: Solar Designer <>
Subject: Re: [PHC] A review per day - Lyra2

> On 09/08/2014 11:00 AM, Marcos Simplicio wrote:
> > Did you actually benchmark all three algorithms on a GPU? Because
> > we did so with Lyra2

I did not.  It's great that you did.  I was wondering what the CUDA code
included with Lyra2 was for - whether it was to show how inefficient
Lyra2 is on GPUs, or to show that it's usable defensively even on GPUs.
Now it's clear that your intent was the former.

> > and, even without the additional small random
> > reads prescribed in section 6.2, the performance on our GPU was
> > ~200 times worse than on our CPU (we tested for memory usages
> > between 96 MB and 800 MB).

This range is currently almost irrelevant for testing of GPU attack
resistance.  It is clear that any sane scrypt-like algorithm will be very
slow on current GPUs, as compared to current CPUs, for per-hash memory
usage in this range.  This would be true for scrypt itself as well, but
we're trying to do better than scrypt.

A currently relevant per-hash memory cost setting is 2 MB for yescrypt,
and probably much less than that for Lyra2 (since it's slower at the same
memory usage).  This is what you'd realistically achieve with mass user
authentication.  96 MB is currently unrealistic for that use case.
scrypt's recommended minimum of 16 MB is difficult to achieve as well -
sometimes it's possible, sometimes not (depending on the required
request rate capacity and exact choice of server hardware).

(BTW, it's similar for cryptocoin use as well, if someone wants a
GPU-unfriendly cryptocoin.)

On Mon, Sep 08, 2014 at 08:28:31PM -0400, Bill Cox wrote:
> Scrypt achieves parity with GPUs at about 4MiB hashes, according to
> Alexander.

This has since moved to 16 MiB, for more optimal code on some NVIDIA GPUs:

... or possibly even 32 MiB, if you compare against cheaper CPUs
(quad-core rather than 8-core).

This relies heavily on scrypt's efficient TMTO, so Lyra2 has got to work
much better than that... but not necessarily well enough at whatever
memory usage it can achieve for thousands of requests per second on a
CPU.  For yescrypt, 10k hashes/s on a 16-core server is achieved at 2 MiB.
For Lyra2, the same might be achieved at something like 512 KiB, I
guess?  (I'd be happy to provide access to our server for such testing.)
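
As a rough sketch of the arithmetic behind these figures: the per-core
time and memory fill rate implied by 10k hashes/s on 16 cores.  It
assumes each core computes one hash at a time, which the thread does not
state, and the 512 KiB figure for Lyra2 is only the guess made above,
not a measurement.

```python
# Back-of-the-envelope arithmetic for the figures quoted above.
# Assumption (not from the thread): each core computes one hash at a time.

MIB = 1024 * 1024
KIB = 1024


def per_core_budget(hashes_per_sec, cores, mem_bytes):
    """Return (ms per hash on one core, MiB of memory filled per second per core)."""
    per_core_rate = hashes_per_sec / cores      # hashes/s handled by one core
    ms_per_hash = 1000.0 / per_core_rate        # time available for each hash
    fill_mib_per_sec = per_core_rate * mem_bytes / MIB
    return ms_per_hash, fill_mib_per_sec


yescrypt = per_core_budget(10_000, 16, 2 * MIB)
lyra2_guess = per_core_budget(10_000, 16, 512 * KIB)  # guessed setting, untested

print("yescrypt at 2 MiB:      %.2f ms/hash, %.1f MiB/s filled per core" % yescrypt)
print("Lyra2 at 512 KiB (?):   %.2f ms/hash, %.1f MiB/s filled per core" % lyra2_guess)
```

The fill rate counts each memory location once; the actual memory
traffic is higher by however many reads and writes the algorithm makes
per location.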

> Finding the parity point would be an interesting test.


Would, say, 512 KiB for Lyra2 be below or above the parity point?

> Frankly, such small memory hashes are only warranted when servers are
> doing a ton of authentications per second.  Alexander likes to say
> 100's to 1000's.
> Small memory hashes all the way down to just a few KiB apparently are
> commonly needed, but I honestly do not know what for!  Hashing just a
> few KiB seems entirely counter to "memory-hard" to me.  However, this
> is where small unpredictable memory apparently provide good GPU defense.

My current target is up to 10k hashes/s on a currently reasonable
server, and this happens to correspond to 2 MiB for yescrypt.

Low memory cost settings, possibly below 2 MiB, may also be needed for
use as a crypt(3) on a Unix system, especially when we consider
occasional uses of that system in (possibly small) VMs (e.g., there are
cheap VPS offers starting with 128 MB RAM or so) or/and on old machines.
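
To illustrate why a small VM constrains the memory cost, here is a
minimal sketch that counts how many hashes could be in flight at once on
a 128 MB VPS.  The 25% share of RAM set aside for password hashing is a
purely hypothetical figure chosen for the example; it does not come from
the thread.

```python
# How many concurrent hash computations fit in a small VM's RAM?
# USABLE_FRACTION is a hypothetical assumption for illustration only.

MB = 1000 * 1000
MIB = 1024 * 1024

VM_RAM = 128 * MB          # the cheap VPS size mentioned above
USABLE_FRACTION = 0.25     # hypothetical share of RAM available for hashing


def max_concurrent_hashes(ram_bytes, usable_fraction, mem_per_hash):
    """Number of hashes that can hold their full memory at the same time."""
    return int(ram_bytes * usable_fraction) // mem_per_hash


settings = {
    "2 MiB": 2 * MIB,
    "16 MiB": 16 * MIB,
    "96 MB": 96 * MB,
}

for label, m_cost in settings.items():
    n = max_concurrent_hashes(VM_RAM, USABLE_FRACTION, m_cost)
    print("%-7s per hash: %d concurrent" % (label, n))
```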

> I was trying to say Lyra2 and Yescrypt get t_cost usage right, and
> that Lyra2 has no setting that allows for fewer than 9 read/write
> operations per memory location, when TwoCats does 2 by default, and
> Yescrypt runs with 2.25 with t_cost == 0 (which I prefer).

I think it's 8/3 = ~2.67 for yescrypt with p=1, and closer to 7/3 = ~2.33
for high p (assuming YESCRYPT_PARALLEL_SMIX).
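
For comparison, a small sketch that puts the read/write counts per
memory location quoted in this exchange side by side.  The values are
taken from the messages above as stated (the yescrypt ones are the
estimates just given, not measurements), and the dictionary labels are
made up for the example.

```python
# Read/write operations per memory location, as quoted in this thread.
# These are the posters' estimates, not measured values.
from fractions import Fraction

ops_per_location = {
    "TwoCats (default)": Fraction(2),
    "yescrypt, p=1": Fraction(8, 3),
    "yescrypt, high p": Fraction(7, 3),
    "Lyra2 (minimum)": Fraction(9),
}

baseline = ops_per_location["yescrypt, p=1"]

for name, ops in ops_per_location.items():
    # Ratio > 1 means more memory accesses per location than yescrypt at p=1.
    ratio = ops / baseline
    print("%-18s %5.2f ops/location, %.2fx yescrypt p=1" % (name, float(ops), float(ratio)))
```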

