phc-discussions - Re: [PHC] Comparing speed of entries

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150302154124.GA31407@openwall.com>
Date: Mon, 2 Mar 2015 18:41:24 +0300
From: Solar Designer <solar@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Comparing speed of entries

On Mon, Mar 02, 2015 at 06:10:25AM -0800, Bill Cox wrote:
> Please excuse and correct mistakes below.
[...]
> Yescrypt wit t_cost == 1 (my preferred setting),

Why not 0?

> initializes memory like Scrypt doing 1 write and 1 block hash.

With YESCRYPT_RW there are both reads and writes during the
initialization phase as well.  So 2 I/O's per block.

> In the second loop, he hashes
> 2/3rds of memory with two reads and one write.

No, with one read and one write.  And that's 2/3 with t = 1, but by
default I recommend t = 0, in which case it's just 1/3.  This is not
"2/3rds of memory", etc., though - but how many block numbers are
generated (some will be duplicate).

For YESCRYPT_RW, t = 0, p = 1 it's 8/3*N I/O's total.  So e.g. for
1 GB peak memory size, 2.67 GB of I/O would occur.

For TwoCats, I think it was 2*N I/O's, so 2 GB of I/O for 1 GB peak.

> This is a total of 5/3rds hashes per memory block,

For t = 1, yes.

> lower than any other remaining entry.  My old TwoCats did 1 :-)

Yes, but I think it makes more sense to count I/O's per memory block.
The "hashes" may be easily scaled one way or the other by adjusting the
round count of the primitive - well, except that we don't want to go
below 1 round.

> While Lyra2 has stronger TMTO defense it will be slower as a result,
> lowering memory*time defense.  It is not a clear-cut issue.  With t_cost ==
> 1, I can do a 1/2 memory attack against Yescrypt with no change to
> memory*time cost, which may help some attackers.

I guess you mean with t = 0, and I guess you're referring to the
area cost being just 2/3*N at 4/3*N time.  Yes, that's by design, and
with a different design it could at best be 5/6*N, which isn't that
much higher, by the same time point.  (In scrypt, it would have been
1/3*N at 4/3*N, except that scrypt can't be made to stop this early.
In yescrypt, this becomes the new optimal stop point.)

> However, especially when
> running in external DRAM and 2 or more threads (how it _should_ be run for
> disk encryption key derivation), memory bandwidth is the limiting factor.
> In this case, Yescrypt's lead becomes significant, with an average of 3 I/O
> operations per memory location,

Why 3 I/O's per memory location?  For p = 1, t = 0, it's 8/3, which is
below 3.  For p = 1, t = 1, it's 10/3, which is above 3.  For p > 1 the
number of I/O's reduces, as the final multi-threaded accesses to the
shared arena are read-only.  For t = 0 and high p, it's asymptotically 7/3.

Anyway, these are close to 3.

> compared to Lyra2's 6  This should lead to
> a 2X advantage in Yescrypt's peak memory*time cost.  My old entry had 2
> read/write operations per memory block :-)  However, in Scrypt compatible
> mode, Yescrypt does only 2 read/write operations per block, and is
> equivalent in speed to my old algorithm, though mine had better TMTO
> resistance in this mode.

As you're aware, scrypt deliberately has no TMTO resistance, and I
intentionally preserved this property in YESCRYPT_WORM mode.

Alexander