phc-discussions - Re: [PHC] Compute time hardness

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date: Sun, 5 Apr 2015 12:58:00 +0300
From: Solar Designer <solar@...nwall.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Compute time hardness

On Fri, Apr 03, 2015 at 01:19:49PM +0300, Solar Designer wrote:
> On Thu, Apr 02, 2015 at 02:55:46PM +0300, Solar Designer wrote:
> > Can we estimate the number of transistors needed for a naive multiplier
> > (minimizing solely the circuit depth)?  60,000 / 1024 partial product
> > bits gives us 60 transistors per bit.  Perhaps half of them would be
> > spent on the first set of additions (since 1024 is close to
> > 512+256+128+64 = 960 inputs to further additions).  If so, we have 30
> > transistors to spend on one full adder, which is more than we need -
> > which is 8 to 22, from what I could find.  Of course, we wouldn't use
> > ripple-carry adders, but 30 transistors per bit feels about right for a
> > carry look-ahead adder.
> 
> I made an error here.  1024 bits need only 512 full adders for the
> first set of additions, so we have 60 transistors per full adder there.
> 
> > So the circuit size appears on par with the naive one (and their fastest
> > one is probably larger than the naive one).  Clearly, so much effort was
> > put into this not just for the sake of weirdness.
> 
> With the above correction, my estimate is that the 60,000 transistor
> circuit is maybe twice larger than a naive one would have been, and
> the fastest circuit from this PhD thesis is 3x larger than a naive one.

I made another error there.  When writing the above, I forgot that those
multipliers are 53x53 rather than 32x32.  Since 53^2/(32^2) = ~2.74, the
fastest multiplier from that PhD thesis actually feels on par with a
naive one now, die area wise.

Also, my "960 inputs to further additions" is slightly wrong since those
later additions are wider than 32-bit, but this doesn't affect the
estimates significantly since the additions only get much wider when
relatively few are needed.

Sorry for so many errors and corrections.

Alexander