phc-discussions - Re: [PHC] Re: Asymmetric proof-of-work based on the Generalized Birthday problem

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAOLP8p4YO+4C_eEwZsBVZU1eag5ft-qVkOxyj82KK+yjDGJvYg@mail.gmail.com>
Date: Wed, 30 Sep 2015 09:30:15 -0700
From: Bill Cox <waywardgeek@...il.com>
To: "discussions@...sword-hashing.net" <discussions@...sword-hashing.net>
Subject: Re: [PHC] Re: Asymmetric proof-of-work based on the Generalized
 Birthday problem

This is excellent work, IMO, by both Cuckoo Cycle and this new algorithm
based on Wagner's Generalize Birthday attack.  Hopefully some comments
below will be considered constructive, as I intend them.

On Wed, Sep 30, 2015 at 6:40 AM, John Tromp <john.tromp@...il.com> wrote:

> dear Dmitry,
>
> > Comments are welcome.--
>
> Your statements on Cuckoo Cycle appear to be based
> on an obsolete version of the paper.
>

I have verified this to some extent myself.  Cuckoo Cycle appears to be
bandwidth-hard, just like the new algorithm.  Dmitry has a tendency to show
competing algorithms in an unfavorable light.  If I could change just one
thing in Dmitry's excellent work, it would be to have him show competitor's
work without so much bias.

My main concern with both Cuckoo Cycle and this new algorithm is that the
bandwidth both use is too low.  The latest Radeon R9 Nano GPU card does 475
GiB/s.  Any new PoW which aims for ASIC resistance based on
bandwidth-hardness needs to take this new development into account.
Stacked-die GRAM with through-hole vias is a game-changer, and if you make
use of it, it will dramatically improve your ASIC resistance.  A very
simple way to see how badly an ASIC is likely to beat your algorithm is
this formula:

   ASIC_advantage = ASIC_bandwidth / your_bandwidth

I know that's massively over-simplified, ignoring factors like power, but
it's OK as a first order approximation.

For your bandwidth to achieve strong ASIC resistance, I think you need to
tune your algorithms for GPUs, rather than CPUs.  The 475 GiB/s of the Nano
card for the given power is just too high compared to what CPU miners will
achieve.

The current state of the art is Ethash, used in Etherium.  I've not yet
gotten permission to publish more detail on my GPU Hog algorithm, but let
me just say it looks straight-forward to make full use of the 475 GiB/s of
the Nano.  Ethash only achieves 50 GiB/s.  So, please go about crushing
Ethash's ASIC resistance :)

The strongest ASIC attack I see against Ethash is simply making use of a
high-end ASIC's bandwidth to beat Ethash's fairly low 50GiB/s.  This attack
is a no-brainer and simply requires a fat wallet.  The second strongest
attack I see is to mount a massively parallel generation of the tiny
portions of the 4 GiB data needed per nonce, from a 16 MiB on-chip cache.
Both attacks require very expensive ASICs, but both result in the ASIC
winning by a 10X-ish factor.  A third attack is to incorporate flash
directly on the ASIC.  There are $29 microcontrollers now with 4 GiB of
flash, so this seems like a viable attack.  If the ASIC manages to solve
the bandwidth problem to on-chip flash, which seems doable, this could be a
devastating attack.

Any state-of-the-art algorithm should defeat these attacks.  It's as much a
matter of good engineering as mathematics.  Just think "475 GiB/s" and go
make it happen.  With the coders who wrote Argon2, I'm sure Dmitry's team
has the required skills.

One more comment: Don't focus too much on TMTO resistance.  There is a TMTO
resistance vs low-memory verification speed trade-off.  You shouldn't
damage verification time by 10X in order to punish a 2X TMTO attack with a
100X bandwidth penalty.  TMTO attacks just need to increase bandwidth, and
even 2X is enough to defeat TMTO ASIC attacks.  If to recompute a missing
block in a 2X TMTO attack, you need to fetch two from memory, you're in
good shape.  TMTO resistance in itself is not the end-goal.

Bill

Content of type "text/html" skipped