phc-discussions - Re: [PHC] Comparing speed of entries

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date: Mon, 2 Mar 2015 06:10:25 -0800
From: Bill Cox <waywardgeek@...il.com>
To: "discussions@...sword-hashing.net" <discussions@...sword-hashing.net>
Subject: Re: [PHC] Comparing speed of entries

On Sun, Mar 1, 2015 at 9:15 PM, Ben Harris <ben@...rr.is> wrote:

> On 2 March 2015 at 12:30, Bill Cox <waywardgeek@...il.com> wrote:
>
>>
>> There seems to be some confusion about how to compare the speed of
>> entries in the PHC.  IMO, we should:
>>
>> - Compare entries when using the _same_ hash function, the same number of
>> rounds, and the same number of threads.
>>
>> With this simple rule, it is easy to see that Yescrypt is the fastest,
>> Lyra2 is the second fastest.
>>
>>
> The comparisons are more difficult for the entries that aren't 'sequential
> memory hard', e.g. Catena's lambda variable. I'd like to see a
> normalisation of these entries to be the number of invocations of the hash
> primitive. You then have two independent variables 'memory' and 'hash
> invocations' and the dependent variable is 'time' or 'bandwidth'.
>
> Also looking at the numbers for the <= L1 and <= L2 memory sizes (for the
> benchmarking machine).
>
>
The comparisons are difficult, but simpler with the same hashing.  Catena
has no method for dealing with external DRAM cache miss penalties, which
will dominate it's runtime unless they use the same size memory blocks.
Hashing about 4 or 8 KiB at once solves this problem, so you can bury it in
the hash function.

With this method of comparing, Yescrypt is slightly faster than Lyra2 in
computation bound processing, but significantly faster when I/O bound.
Catena suffers from the 2-3X memory*time penalty that are expected from all
cache-timing-resistant algorithms when compared to Yescrypt.

Please excuse and correct mistakes below.  It has been a while since I
carefully studied this:

With the same block-hash functions, Catena-3 does 1 hash and write of all
RAM in the first "row", one read/write and one hash of all RAM in the
second row (assuming the two blocks are XORed or something to reduce the
runtime), and one read and one hash of all ram in the third row.  This is a
total of 4 read/write operations and 3 block hashes per memory block.  This
seems to be about what is required for a cache-timing resistant algorithm
to be secure enough against TMTO attacks, so this is not a mark against
Catena, IMO, but a mark against the whole class of algorithms.  I have not
studied their newer algorithm.

Lyra2 does a read/write and a block hash in the first loop, and another
block hash and 2 reads and 2 writes in the second wandering phase.  When
hash-computation bound, it's runtime is basically twice the number of
memory blocks, just like Scrypt.

Yescrypt wit t_cost == 1 (my preferred setting), initializes memory like
Scrypt doing 1 write and 1 block hash.  In the second loop, he hashes
2/3rds of memory with two reads and one write.  This is a total of 5/3rds
hashes per memory block, lower than any other remaining entry.  My old
TwoCats did 1 :-)

While Lyra2 has stronger TMTO defense it will be slower as a result,
lowering memory*time defense.  It is not a clear-cut issue.  With t_cost ==
1, I can do a 1/2 memory attack against Yescrypt with no change to
memory*time cost, which may help some attackers.  However, especially when
running in external DRAM and 2 or more threads (how it _should_ be run for
disk encryption key derivation), memory bandwidth is the limiting factor.
In this case, Yescrypt's lead becomes significant, with an average of 3 I/O
operations per memory location, compared to Lyra2's 6  This should lead to
a 2X advantage in Yescrypt's peak memory*time cost.  My old entry had 2
read/write operations per memory block :-)  However, in Scrypt compatible
mode, Yescrypt does only 2 read/write operations per block, and is
equivalent in speed to my old algorithm, though mine had better TMTO
resistance in this mode.

Other factors do matter, which is why I prefer Yescrypt over all the other
entries, even my old TwoCats.  Yescrypt's defense against bot-nets,
improved GPU defense, and computational hardness are among my favorite
features of Yescrypt.  However, speed does still matter, which is another
reason I prefer Yescrypt over the remaining entries.  However, Lyra2 is an
outstanding algorithm, and a bit simpler, so I can see why some people
might prefer it, even if it does not provide the strongest possible defense
in as many scenarios.

Bill

Content of type "text/html" skipped