Date: Tue, 31 Mar 2015 02:18:02 +0300
From: Solar Designer <>
Subject: Re: [PHC] Argon2

Dmitry, Bill -

On Mon, Mar 30, 2015 at 01:16:04PM -0700, Bill Cox wrote:
> Yescrypt (with 2 out of 3 PWXFORM rounds commented out): 0m0.668s

You probably mean "with 4 out of 6 PWXFORM rounds commented out" (that's
2 out of 3 double-round lines in the current yescrypt-simd.c file), so
leaving it with 2 rounds.

> Argon2: 0m0.746s
> Lyra2: 0m1.126s

Overall, Argon2 appears to be very well balanced, unlike the original
Argon, which appeared to be TMTO-obsessed.

The major exception here is Argon2's excessive instruction-level
parallelism, which possibly makes it ~16x weaker.  I've commented on
this below.

> Is Argon2d in the running, or only the revised
> version of the original Argon?

Only the tweaked original Argon is in PHC.

> For comparison, here's Yescrypt running it's PHS function on 1GiB, but with
> 1/3 of the number of rounds as usual.  This is, if I am not mistaken,
> roughly equivalent in complexity to 2 Blake2b rounds:

2 rounds of pwxform vs. 2 rounds of Blake2b?

Depends on what you mean by complexity.  I think one round of pwxform is
actually simpler than one round of Blake2b in terms of specification
complexity.  (This does not include specification of how the pwxform
S-boxes are initialized, though.)  They're similar in terms of the
number of instructions needed per byte processed.  However, one round of
pwxform is likely much higher latency than one round of Blake2b (and
larger as well, but this is usually unimportant) in hardware.

Note that your yescrypt benchmark also invokes 8 rounds of Salsa20 per
1024-byte block.  That's equivalent to 1/2 a round of Salsa20 per 64-byte
sub-block, so you're actually comparing 2 rounds of pwxform + 0.5 rounds
of Salsa20 vs. 2 rounds of Blake2b in Argon2.  I am lucky yescrypt is on
par with Argon2 in this test, despite doing maybe 1.25x more computation
per byte (and containing a lot less parallelism, see below).  (And even
more than that if we consider hardware rather than software.)
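
The per-sub-block arithmetic above can be checked with a short sketch.  The
constant names are illustrative only (they are not taken from
yescrypt-simd.c); the values are the ones quoted in this message.

```python
# Illustrative constants; names are assumptions, values come from the text above.
BLOCK_BYTES = 1024                # yescrypt block size in this benchmark
SUB_BLOCK_BYTES = 64              # 512-bit sub-block
SALSA20_ROUNDS_PER_BLOCK = 8      # Salsa20 rounds invoked once per 1024-byte block
PWXFORM_ROUNDS_PER_SUB_BLOCK = 2  # rounds left after commenting out 4 of 6

# Number of 64-byte sub-blocks in one 1024-byte block.
sub_blocks_per_block = BLOCK_BYTES // SUB_BLOCK_BYTES

# Salsa20 work amortized over the sub-blocks of one block.
salsa20_rounds_per_sub_block = SALSA20_ROUNDS_PER_BLOCK / sub_blocks_per_block

print(f"sub-blocks per block: {sub_blocks_per_block}")
print(f"per sub-block: {PWXFORM_ROUNDS_PER_SUB_BLOCK} pwxform rounds "
      f"+ {salsa20_rounds_per_sub_block} Salsa20 rounds")
```

This only counts rounds; it says nothing about the relative cost of a pwxform,
Salsa20, or Blake2b round, so the "maybe 1.25x" figure is not derived here.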

> On Mon, Mar 30, 2015 at 3:18 AM, Dmitry Khovratovich wrote:
> > Cryptographers can be interested in the new 8192-bit permutation we
> > designed

That's cool, but that's 8192-bit parallelism, right?  Unless a defender
makes use of it, this makes Argon2 roughly 8192/512 = 16 times weaker than
yescrypt at default settings against ASICs.  In yescrypt, the 512-bit
sub-blocks would be processed sequentially, so e.g. in Bill's testing
it's 2*16 = 32 sequential rounds of pwxform (with their latencies) per
kilobyte.  For Argon2, it's just 2 sequential rounds of Blake2.  Right?

(This is on top of the ASIC latency difference of individual rounds of
these primitives.)
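
A rough sketch of the sequential-depth comparison, under a deliberately
simplified model: a 1 KiB block is processed in chunks of a given width, one
chunk after another, with a fixed number of rounds per chunk.  The function
name and the model itself are assumptions made for illustration; in
particular, treating Argon2's permutation as exactly 2 rounds deep over the
full 8192 bits is the reading questioned above, not a confirmed property.

```python
def sequential_rounds_per_block(block_bits: int, chunk_bits: int,
                                rounds_per_chunk: int) -> int:
    """Rounds on the critical path when a block is processed chunk by chunk.

    Hypothetical model: chunks are strictly sequential, and the rounds
    within a chunk are sequential as well.
    """
    chunks = block_bits // chunk_bits
    return chunks * rounds_per_chunk


BLOCK_BITS = 8192  # 1 KiB block

# yescrypt in Bill's test: 512-bit sub-blocks, 2 pwxform rounds each.
yescrypt_depth = sequential_rounds_per_block(BLOCK_BITS, 512, 2)

# Argon2 as read above: the whole 8192 bits can proceed in parallel, 2 rounds deep.
argon2_depth = sequential_rounds_per_block(BLOCK_BITS, 8192, 2)

print(f"yescrypt sequential rounds per KiB: {yescrypt_depth}")
print(f"Argon2 sequential rounds per KiB:   {argon2_depth}")
print(f"ratio: {yescrypt_depth // argon2_depth}x")
```

The ratio counts rounds only and ignores the per-round latency difference
between the primitives, which comes on top of it.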

> > Argon2 has two variants: Argon2d and Argon2i. Argon2d is faster and
> > uses data-depending memory access, which makes it suitable for
> > cryptocurrencies and applications with no threats from side-channel
> > timing attacks. Argon2i uses data-independent memory access, which is
> > preferred for password hashing and password-based key derivation.
> > Argon2i is slower as it makes more passes over the memory to protect
> > from tradeoff attacks (3 passes by default comparing to 1 default pass
> > in Argon2d).

This i vs. d separation and the numbers of passes sound right to me.

Like Bill, I think Argon2d may be more appropriate for many KDF uses as
well, though.

> > Webpage:
> > Specification:
> > Implementation:
> >
> > Comments are welcome.

Overall, Argon2 looks very nice, but the 8192-bit parallelism looks
crazy. ;-)


