lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-Id: <D89189E1-19FA-4386-BD0F-DCF109815B0D@gmail.com> Date: Wed, 1 Apr 2015 22:20:42 +0200 From: Dmitry Khovratovich <khovratovich@...il.com> To: "discussions@...sword-hashing.net" <discussions@...sword-hashing.net> Subject: Re: [PHC] Compute time hardness There is also a PhD thesis on the topic http://infolab.stanford.edu/pub/cstr/reports/csl/tr/94/617/CSL-TR-94-617.pdf But anyway, as 32-bit multiplication is equivalent to a sum of 16 (average) addends, it can not be more than 4x slower given sufficient area (which we have). > On 1 Apr 2015, at 20:12, Samuel Neves <sneves@....uc.pt> wrote: > >> On 01-04-2015 18:07, Solar Designer wrote: >>> On Wed, Apr 01, 2015 at 06:36:32PM +0300, Solar Designer wrote: >>>> On Wed, Apr 01, 2015 at 08:20:29AM -0700, Bill Cox wrote: >>>> Alexander's technique of counting sequential operations, and giving >>>> multiply more weight seems reasonable to me, though he's once again >>>> under-cutting himself by only claiming a multiply is worth 3 additions. >>>> Use 8 and you'll be closer to reality, IMO. >>> I agree. All we know is it's somewhere between 3 and 32. >> OK, it's obviously not 32. If we count only gate delays, then it's easy >> to see that we need one set of 1024 parallel 1-bit ANDs for the partial >> products, and then 5 sequential groups of (equivalents of) 32-bit ADDs. >> So that's a total of 1 AND + 5 ADD. But there might be wire delays >> greater than those in a circuit with a more regular structure. In fact, >> I found a patent (expiring soon) that appears to focus on optimizing the >> structure and it gives "a propagation delay of only 7.5 full adders" for >> a 32x32 multiplier (and 10.5 for 58x58, 11.5 for 61x61). It isn't >> entirely clear (at least without actually reading and understanding the >> whole thing, which I did not) what exactly they mean by a "full adder" >> here (same bit width as the multiplier's operands? something else?) > > You can certainly do the n^2 ANDs and add things up, but that seems like a waste of gates. What is the cost metric being > considered here? Time (circuit depth)? Area? Time * Area? > > As a theoretical curiosity, Brent and Kung [1] came up with an optimal area-time n-bit multiplier of depth O(n^(1/2) log > n) and area O(n log n). Melhorn-Preparata's [2] circuit would be O(log n), O(n^2 / log n). The above would be > (asymptotically) depth O(log n) and area O(n^2). > > [1] http://maths-people.anu.edu.au/~brent/pub/pub055.html > [2] http://www.sciencedirect.com/science/article/pii/S0019995883800618 > >
Powered by blists - more mailing lists