Date: Wed, 03 Sep 2014 10:59:44 -0400
From: Bill Cox <>
Subject: Re: [PHC] Re: Tradeoff cryptanalysis of password hashing schemes


On 09/03/2014 10:30 AM, Dmitry Khovratovich wrote:
> Some more details for the ASIC discussion.
> 1) An ASIC-equipped adversary will aim to minimize the running
> costs, which on the long term will be dominated by the power
> consumption and cooling.

I don't agree.

I have a simple on-paper ASIC attack assuming the attacker is going
high-end, using Intel's same process as my Ivy Bridge processor.  In
that case, with the expensive flip-chip packaging required for this
power and GDDR5 interfaces, I estimated the cost of the ASIC at $350
each in fairly high volume, similar to a reasonably high-end CPU made
in this process.  I also estimated the 16GiB GDDR5 DRAM at $150, for a
$500 subtotal, plus another $100 for the power supply and board.  I
estimated the whole system power at about 118W, with 70W in the ASIC
(similar to an Intel CPU), and the total cost of ownership for 5 years
came out to about $1,100.

The power budget came out just under the hardware budget.  It's not
clear to me that power is going to dominate.  It is a major
component, sometimes over half of the total, but not always.
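
As a rough check of the arithmetic above, here is a minimal sketch of the
5-year cost of ownership for one board.  The hardware prices and the 118W
figure come from this message.  The electricity price is an assumption: the
message does not state one, and about $0.10/kWh happens to land close to the
quoted $1,100 total.  Cooling overhead is not modelled.

```python
# Back-of-the-envelope 5-year cost of ownership for one attack board.
# Hardware and power figures are taken from the message; the electricity
# price is an ASSUMPTION chosen for illustration.

ASIC_COST_USD = 350.0        # flip-chip ASIC in fairly high volume
DRAM_COST_USD = 150.0        # 16GiB GDDR5
BOARD_PSU_COST_USD = 100.0   # power supply and board
SYSTEM_POWER_W = 118.0       # whole system, about 70W of it in the ASIC
YEARS = 5
USD_PER_KWH = 0.10           # assumed, not stated in the message


def hardware_cost() -> float:
    """One-time hardware cost per board in USD."""
    return ASIC_COST_USD + DRAM_COST_USD + BOARD_PSU_COST_USD


def energy_cost(usd_per_kwh: float = USD_PER_KWH) -> float:
    """Electricity cost in USD for running one board continuously."""
    hours = YEARS * 365 * 24
    kwh = SYSTEM_POWER_W / 1000.0 * hours
    return kwh * usd_per_kwh


if __name__ == "__main__":
    hw = hardware_cost()
    power = energy_cost()
    print(f"hardware: ${hw:.0f}")
    print(f"power:    ${power:.0f}")
    print(f"total:    ${hw + power:.0f}")
```

Under that assumed price, power stays just below hardware.  A higher
electricity price flips the balance, which is why neither term can be said
to dominate in general.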

The NRE for such a device is probably about $10M, so you'd better be
buying several 10's of thousands of boards, preferably 100's of
thousands.  This is really a government-scale attack.
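
To show why the volume matters, this small sketch spreads the $10M NRE over
different board counts and compares it with the roughly $1,100 per-board
cost of ownership quoted earlier.  The board counts are illustrative picks
in the "tens to hundreds of thousands" range, not numbers from this message.

```python
# Amortizing NRE over production volume.
# NRE and per-board cost are from the message; volumes are illustrative.

NRE_USD = 10_000_000       # "probably about $10M"
PER_BOARD_TCO_USD = 1_100  # 5-year cost of ownership per board


def nre_per_board(boards: int) -> float:
    """Share of the NRE carried by each board, in USD."""
    return NRE_USD / boards


def nre_fraction_of_total(boards: int) -> float:
    """Fraction of the all-in per-board cost that is NRE."""
    share = nre_per_board(boards)
    return share / (share + PER_BOARD_TCO_USD)


if __name__ == "__main__":
    for boards in (10_000, 30_000, 100_000, 300_000):
        print(f"{boards:>7} boards: NRE ${nre_per_board(boards):8.2f}/board, "
              f"{nre_fraction_of_total(boards):.0%} of total")
```

At 10,000 boards the NRE share is about as large as the board's own cost of
ownership.  At 100,000 boards it drops to $100 per board.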

> 2) A high-budget adversary will not restrict himself to
> commercially available memory chips only, but will definitely
> consider custom designs.

I agree, but only for government-scale attacks.  It's one thing to pay
Intel enough cash to get your chip in their fab.  It's a very
different thing to start mucking with their processor directly, which
is their bread-and-butter.  Same goes for Samsung GDDR5 chips.

> 3) The memory power consumption (roughly) consists of retention
> power (to sustain the state) and active energy (to read or write).
> 4) Regarding active energy, it would be natural to prefer a memory 
> architecture that consumes 0.0001 J to read 1 Gbit (one of SRAM
> designs - [1]) to the one that consumes 0.5J to read 1 Gbit
> (DDR5).
> 5) I did not calculate the retention power for DDR5, but to match
> [1] ( 0.1 W ) it must be around 3% of the maximum power
> consumption, not speaking of 0.0001 W for the design [2].

If you think you have a way to design 4GiB ultra-fast DRAM chips
better than Samsung does, be my guest, but I don't buy it.  Go pick a
real commercial DRAM part, and use its specs for your estimate,
rather than guessing what you think they *should* be building.  Trust
me, if they thought they could lower the power even 2X, and maintain
the density and speed, they would!

> 6) There are other, low-power DRAM designs, that can be considered
> here.

True, and some might have lower total cost of ownership per number of
broken passwords.  It would require a lower-speed lower-power ASIC
attack.  This might be more cost effective than the high-end attack
I've been considering.  However, I doubt that the ASIC power will
dominate, or the DRAM power will dominate, or the hardware cost will
dominate, or the power cost will dominate.  If any of those things
were true, clever engineers would engineer a better solution.

> 7) Large on-chip memory is needed mainly to reduce the latency.
> However, for the schemes currently attacked this is not a problem.
> Catena has memory-independent addressing, and Lyra2 has huge blocks
> sequentially stored in the memory. As a result, off-chip memory
> with latency up to 10-15 cycles is still suitable for the tradeoff
> attacks as the latency of the entire scheme remains pretty much the
> same.

This is true for Catena, Lyra2, TwoCats, Yescrypt, and others.  It is
not true for Argon, where a specially crafted cache architecture could
speed up cache-bound hashing, which is what most Argon hashes would
be.  Catena, based on the full Blake2b hash, would run slowly enough
to make having on-chip cache pointless, but that's a Catena problem.

> 8) If we consider DRAM-restrictive adversaries, then Lyra2 can be
> run with 1/2 of memory with no energy penalty: the increase of
> memory reads from 6 GB to 7.6 GB per password and 20% increase in
> the running time is compensated by the 50% memory reduction. It may
> even be that running Lyra2 with 1/2 of memory takes less energy
> than with full memory.

I'll take a closer look at Lyra2 when I get to its review.  I don't
see this as a major problem for Lyra2, assuming these numbers are
right.  Once Lyra2 is multi-threaded, it should easily max out the
external memory bandwidth.  My high-end ASIC attack would not benefit
from a 1/2 TMTO against a multi-threaded Lyra2, because it will be
memory bandwidth limited against Lyra2, doing about 32X faster than my
Ivy Bridge PC (12GiB/s bandwidth for my PC vs 16 x 24GiB/s for my ASIC).
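
A quick sketch of the numbers behind this paragraph.  The bandwidth figures
are from this message and the 6 GB and 7.6 GB read volumes are from the
quoted point 8.  The model itself is a simplifying assumption: it treats the
attack as purely bandwidth-bound, so time per hash scales with bytes read.
The name ASIC_LINKS for the factor of 16 is hypothetical.

```python
# Bandwidth-bound model of the ASIC attack against multi-threaded Lyra2.
# ASSUMPTION: time per hash is proportional to bytes read from memory.

PC_BANDWIDTH_GIB_S = 12.0   # Ivy Bridge PC
ASIC_LINKS = 16             # the "16 x" factor (hypothetical name)
ASIC_LINK_GIB_S = 24.0      # bandwidth per link

FULL_MEMORY_READS_GB = 6.0  # reads per password at full memory
HALF_MEMORY_READS_GB = 7.6  # reads per password at 1/2 memory


def asic_speedup() -> float:
    """ASIC throughput relative to the PC when both are bandwidth-bound."""
    return ASIC_LINKS * ASIC_LINK_GIB_S / PC_BANDWIDTH_GIB_S


def half_memory_slowdown() -> float:
    """Relative time per hash at 1/2 memory in the bandwidth-bound model."""
    return HALF_MEMORY_READS_GB / FULL_MEMORY_READS_GB


if __name__ == "__main__":
    print(f"ASIC vs PC: {asic_speedup():.0f}X")
    print(f"1/2 memory time per hash: {half_memory_slowdown():.2f}x")
```

In this model the 1/2-memory tradeoff makes each hash take longer on the
same memory bus, so it only pays off if memory capacity, rather than
bandwidth, is the limiting cost.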

I would prefer some better compute-time hardening in Lyra2 though, for
protection of very small memory hashes that do fit on an ASIC.
