[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150330231801.GA31544@openwall.com>
Date: Tue, 31 Mar 2015 02:18:02 +0300
From: Solar Designer <solar@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Argon2
Dmitry, Bill -
On Mon, Mar 30, 2015 at 01:16:04PM -0700, Bill Cox wrote:
> Yescrypt (with 2 out of 3 PWXFORM rounds commented out): 0m0.668s
You probably mean "with 4 out of 6 PWXFORM rounds commented out" (that's
2 out of 3 double-round lines in the current yescrypt-simd.c file), so
leaving it with 2 rounds.
> Argon2: 0m0.746s
> Lyra2: 0m1.126s
Overall, Argon2 appears to be very well balanced, unlike the original
Argon, which appeared to be TMTO-obsessed.
The major exception here is Argon2's excessive instruction-level
parallelism, which possibly makes it ~16x weaker. I've commented on
this below.
> Is Argon2d in the running, or only the revised
> version of the original Argon?
Only the tweaked original Argon is in PHC.
> For comparison, here's Yescrypt running it's PHS function on 1GiB, but with
> 1/3 of the number of rounds as usual. This is, if I am not mistaken,
> roughly equivalent in complexity to 2 Blake2b rounds:
2 rounds of pwxform vs. 2 rounds of Blake2b?
Depends on what you mean by complexity. I think one round of pwxform is
actually simpler than one round of Blake2b in terms of specification
complexity. (This does not include specification of how the pwxform
S-boxes are initialized, though.) They're similar in terms of the
number of instructions needed per byte processed. However, one round of
pwxform is likely much higher latency than one round of Blake2b (and
larger as well, but this is usually unimportant) in hardware.
Note that your yescrypt benchmark also invokes 8 rounds of Salsa20 per
1024-byte block. That's equivalent to 1/2 a round of Salsa20 per 64-byte
sub-block, so you're actually comparing 2 rounds of pwxform + 0.5 rounds
of Salsa20 vs. 2 rounds of Blake2b in Argon2. I am lucky yescrypt is on
par with Argon2 in this test, despite of doing maybe 1.25x more computation
per byte (and containing a lot less parallelism, see below). (And even
more than that if we consider hardware rather than software.)
> On Mon, Mar 30, 2015 at 3:18 AM, Dmitry Khovratovich <khovratovich@...il.com
> > Cryptographers can be interested in the new 8192-bit permutation we
> > designed
That's cool, but that's 8192-bit parallelism, right? Unless a defender
makes use of it, this makes Argon2 like 8192/512 = 16 times weaker than
yescrypt at default settings against ASICs. In yescrypt, the 512-bit
sub-blocks would be processed sequentially, so e.g. in Bill's testing
it's 2*16 = 32 sequential rounds of pwxform (with their latencies) per
kilobyte. For Argon2, it's just 2 sequential rounds of Blake2. Right?
(This is on top of the ASIC latency difference of individual rounds of
these primitives.)
> > Argon2 has two variants: Argon2d and Argon2i. Argon2d is faster and
> > uses data-depending memory access, which makes it suitable for
> > cryptocurrencies and applications with no threats from side-channel
> > timing attacks. Argon2i uses data-independent memory access, which is
> > preferred for password hashing and password-based key derivation.
> > Argon2i is slower as it makes more passes over the memory to protect
> > from tradeoff attacks (3 passes by default comparing to 1 default pass
> > in Argon2d).
This i vs. d separation and the numbers of passes sound right to me.
Like Bill, I think Argon2d may be more appropriate for many KDF uses as
well, though.
> > Webpage: https://www.cryptolux.org/index.php/Argon2
> > Specification: https://www.cryptolux.org/images/0/0d/Argon2.pdf
> > Implementation: https://github.com/khovratovich/Argon2
> >
> > Comments are welcome.
Overall, Argon2 looks very nice, but the 8192-bit parallelism looks
crazy. ;-)
Thanks,
Alexander
Powered by blists - more mailing lists