lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 16 Jul 2015 10:49:52 -0700
From: Bill Cox <waywardgeek@...il.com>
To: "discussions@...sword-hashing.net" <discussions@...sword-hashing.net>
Subject: Re: [PHC] Bandwidth hardened algorithms?

On Thu, Jul 16, 2015 at 3:10 AM, Dmitry Khovratovich <khovratovich@...il.com
> wrote:

> Am I correct that the ROM is used for many subsequent blocks?
>

Yes.


> Then, the miner generates (i1,i2) <- SHA256(Header,Nonce),
> computes/retrieves X1=Yescrypt_4MB (i1), X2 = Yescrypt_4MB (i2), and checks
> if SHA256(X1,X2,Header,Nonce) <Target.
>

I did not make a key point clear enough:  The miner computes a Yescrypt
hash of only 8 KiB, not 4 MiB.  This is critical.  A CPU based miner with
32 GiB of "ROM" data in RAM would compute X =
Yescrypt_8_KiB(4KiB_block(i1), 4KiB_block(i2)), and check if X < Target.


> So this seems to be a scheme with fast variant that needs 32 GB and slower
> variant that lives with 4 MB. And the verification needs 4 MB, too.
>

Yes.  The key idea here is that we need fast verification for two different
groups, and they have different needs.  A miner or "full node" verifies the
entire block chain, and doing a full 8-second hash per block would be
intolerable (the time to fill the ROM is about that on a common desktop).
However, after filling memory once, verification can proceed at over a
million verification per second, because only 8 KiB of data needs to be
read from the ROM for each.


> To figure out the ASIC advantage, we should compare the hash rate of a
> desktop with 32 GB of RAM with the hash rate of 2^13 (disconnected) ASICs
> each having 4MB on chip. In your calculations desktops do 2^16 hashes per
> second, and the ASICs would do 2^25 hashes per second, so about
> 4000x-advantage.
>
> Dmitry
>

Disconnected ASICs would lose quite badly, since the CPU miner is reading 8
KiB per hash at 20 GiB/s, while a disconnected ASIC must compute 2 4-MiB
Yescrypt hashes in series.  We chose these computations to be 1 ms (by
adjusting t_cost) for the CPU defender.  There is no way the ASIC can do
this more than 10X faster, so each hash will require (1ms + 1ms)/10 =
0.2ms.  This is only 5,000 hashes per second.  In comparison, the CPU
defender fills 20GiB of memory bandwidth reading 8KiB of memory per hash.
This is over 2.6 million hashes per second.  The CPU defender is over 500X
faster than the 4MiB ASIC.

A more interesting ASIC attack is when we have 8192 ASICs with 4MiB
networked together.  In this case, each ASIC holds 4 MiB of the ROM data,
for a total of 32 GiB.  To compute Yescrypt(4KiB_block(i1),
4KiB_block(i2)), ASIC_i2 must transmit 4KiB_block(i1) to ASIC_i1.  This is
only 4KiB transmitted, compared to the CPU/DRAM case where the CPU reads
8KiB.  However, to be comparable in speed to the CPU case, the ASICs need a
routing network capable of 8192*10GiB/s ==> 8192*10GiB/s out.  The total
bandwidth of this random-routing network is over 80 TiB/s.  The challenge
is to design the router so that it does not dominate the cost.

At a minimum, the routing network requires 10 times 10-gigabit pins per
ASIC.  An Achronix FPGA has 64 of these pins, and in theory costs around
$600.  However, they also have to connect to other nodes in the routing
fabric.  I do not know how to do this routing without several layers of
these FPGAs.  Their cost would dominate in my designs.

A custom routing ASIC which maximizes bandwidth while minimizing cost would
work out better.  If I remember correctly, GDDR5 DRAMs have about 20 GiB/s
bandwidth and only cost around $11 for the small ones at this speed.  If we
could for $10 build a routing ASIC with 10GiB/s in and out, we'd
cost-reduce the router considerably.  Maybe it could be done for $100 per
hashing-ASIC node.

This is what the threat landscape looks like for bandwidth-hardening PoW,
SFAICT.

Bill

Content of type "text/html" skipped

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ