phc-discussions - Flaw in Argon2 TMTO ASIC analysis

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAOLP8p5rFT4gHMQE6GOv23_dFZ3S1XGDen-Qci3ehm77iWAgCw@mail.gmail.com>
Date: Fri, 28 Aug 2015 06:33:53 -0700
From: Bill Cox <waywardgeek@...il.com>
To: "discussions@...sword-hashing.net" <discussions@...sword-hashing.net>
Subject: Flaw in Argon2 TMTO ASIC analysis

I objected to this flawed analysis when applied to Lyra2 and Yescrypt, so
it's only fair I object when applied to Argon2 :)

The Argon2 paper states:

"We conclude that for data-dependent one-pass schemes the adversary is
always able to reduce the memory
by the factor of 4 and still keep the time-area product the same."

This is wrong.  Given their simple attack, there is no memory reduction
factor that results in any time-area product reduction.  The algorithm is
far more TMTO resistant than the authors claim (as is Lyra2).

In their attack, they assume the simple model where alpha is an integer and
we keep every 1/alpha memory location in external DRAM.  When using 1/2 the
memory (alpha = 1/2), and a uniform random index function (the square
function used now is better), the average recomputations, which are called
C(alpha), goes to 2 for large memory hashing.  Their table incorrectly
shows C(alpha) = 1.5.

The function D(alpha) is used to compute the additional runtime factor.
However, it is assumed to be no longer than the deepest computation path
required to recompute missing blocks.  This function D(alpha) is a clear
theoretical lower-bound on recomputation speed, but it is incorrect to use
this number in theiir time-area ASIC attack analysis.

The paper correctly states that the fastest ASIC bandwidths to memory are
on the order of 400 GiB/s.  An ASIC attack will optimize many factors, and
one of the most critical is ASIC cost for the resulting bandwidth.
Assuming this results in a 400 GiB/s ASIC (probably incorrect, but the
analysis is the same for other choices), the runtime is proportional to
C(alpha), the number of recomputations, not D(alpha), the computation tree
depth.  This is because the required bandwidth is increased by a factor of
C(alpha), and the ASIC will be bandwidth limited when attacking Argon2.  I
can think of no reasonable ASIC attack for large-memory hashing (>= 1 GiB)
that would be limited by any other speed factor, as the paper did not
include multiplication based compute-time hardening in this analysis.

Thus the time is proportional to C(alpha) and the area is proportional to
alpha.  The time-area product is alpha/C(alpha).  When using the corrected
C(alpha) for the simple attack (keeping every 1/alpha block), C(alpha) is
2, and at the 1/2 memory level time-area = (1/2)/2 = 1.  At lower alpha
(higher memory reduction factors), recomputations increase faster than the
memory reduction, increasing the area-time cost to the ASIC attacker.
Therefore, there is no choice for alpha that reduces the time-area product.

Improved attacks will keep less memory at high memory addresses and more at
lower addresses.  Against a uniform random index function, this can only
decrease the time-area product by about 15% at most.  With the improved
index function, the maximum TMTO improvement is even less.  I believe the
potential benefit is so small, no significant ASIC attack will ever bother
to do a TMTO against 1-pass Argon2d.

If this were the Argon2 team claiming a competitor had a simple attack
resulting in a 4X area-time reduction, I would be upset.  Since it is their
own algorithm they are talking about, I'll simply correct their math :)

Bill

Content of type "text/html" skipped