lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 1 Apr 2015 22:20:42 +0200
From: Dmitry Khovratovich <khovratovich@...il.com>
To: "discussions@...sword-hashing.net" <discussions@...sword-hashing.net>
Subject: Re: [PHC] Compute time hardness

There is also a PhD thesis on the topic http://infolab.stanford.edu/pub/cstr/reports/csl/tr/94/617/CSL-TR-94-617.pdf

But anyway, as 32-bit multiplication is equivalent to a sum of 16 (average) addends, it can not be more than 4x slower given sufficient area (which we have).



> On 1 Apr 2015, at 20:12, Samuel Neves <sneves@....uc.pt> wrote:
> 
>> On 01-04-2015 18:07, Solar Designer wrote:
>>> On Wed, Apr 01, 2015 at 06:36:32PM +0300, Solar Designer wrote:
>>>> On Wed, Apr 01, 2015 at 08:20:29AM -0700, Bill Cox wrote:
>>>> Alexander's technique of counting sequential operations, and giving
>>>> multiply more weight seems reasonable to me, though he's once again
>>>> under-cutting himself by only claiming a multiply is worth 3 additions.
>>>> Use 8 and you'll be closer to reality, IMO.
>>> I agree.  All we know is it's somewhere between 3 and 32.
>> OK, it's obviously not 32.  If we count only gate delays, then it's easy
>> to see that we need one set of 1024 parallel 1-bit ANDs for the partial
>> products, and then 5 sequential groups of (equivalents of) 32-bit ADDs.
>> So that's a total of 1 AND + 5 ADD.  But there might be wire delays
>> greater than those in a circuit with a more regular structure.  In fact,
>> I found a patent (expiring soon) that appears to focus on optimizing the
>> structure and it gives "a propagation delay of only 7.5 full adders" for
>> a 32x32 multiplier (and 10.5 for 58x58, 11.5 for 61x61).  It isn't
>> entirely clear (at least without actually reading and understanding the
>> whole thing, which I did not) what exactly they mean by a "full adder"
>> here (same bit width as the multiplier's operands? something else?)
> 
> You can certainly do the n^2 ANDs and add things up, but that seems like a waste of gates. What is the cost metric being
> considered here? Time (circuit depth)? Area? Time * Area?
> 
> As a theoretical curiosity, Brent and Kung [1] came up with an optimal area-time n-bit multiplier of depth O(n^(1/2) log
> n) and area O(n log n). Melhorn-Preparata's [2] circuit would be O(log n), O(n^2 / log n). The above would be
> (asymptotically) depth O(log n) and area O(n^2).
> 
> [1] http://maths-people.anu.edu.au/~brent/pub/pub055.html
> [2] http://www.sciencedirect.com/science/article/pii/S0019995883800618
> 
> 

Powered by blists - more mailing lists