lists.openwall.net  lists / announce owlusers owldev johnusers johndev passwdqcusers yescrypt popa3dusers / osssecurity kernelhardening musl sabotage tlsify passwords / cryptdev xvendor / Bugtraq FullDisclosure linuxkernel linuxnetdev linuxext4 linuxhardening PHC  
Open Source and information security mailing list archives
 

Date: Wed, 1 Apr 2015 22:20:42 +0200 From: Dmitry Khovratovich <khovratovich@...il.com> To: "discussions@...swordhashing.net" <discussions@...swordhashing.net> Subject: Re: [PHC] Compute time hardness There is also a PhD thesis on the topic http://infolab.stanford.edu/pub/cstr/reports/csl/tr/94/617/CSLTR94617.pdf But anyway, as 32bit multiplication is equivalent to a sum of 16 (average) addends, it can not be more than 4x slower given sufficient area (which we have). > On 1 Apr 2015, at 20:12, Samuel Neves <sneves@....uc.pt> wrote: > >> On 01042015 18:07, Solar Designer wrote: >>> On Wed, Apr 01, 2015 at 06:36:32PM +0300, Solar Designer wrote: >>>> On Wed, Apr 01, 2015 at 08:20:29AM 0700, Bill Cox wrote: >>>> Alexander's technique of counting sequential operations, and giving >>>> multiply more weight seems reasonable to me, though he's once again >>>> undercutting himself by only claiming a multiply is worth 3 additions. >>>> Use 8 and you'll be closer to reality, IMO. >>> I agree. All we know is it's somewhere between 3 and 32. >> OK, it's obviously not 32. If we count only gate delays, then it's easy >> to see that we need one set of 1024 parallel 1bit ANDs for the partial >> products, and then 5 sequential groups of (equivalents of) 32bit ADDs. >> So that's a total of 1 AND + 5 ADD. But there might be wire delays >> greater than those in a circuit with a more regular structure. In fact, >> I found a patent (expiring soon) that appears to focus on optimizing the >> structure and it gives "a propagation delay of only 7.5 full adders" for >> a 32x32 multiplier (and 10.5 for 58x58, 11.5 for 61x61). It isn't >> entirely clear (at least without actually reading and understanding the >> whole thing, which I did not) what exactly they mean by a "full adder" >> here (same bit width as the multiplier's operands? something else?) > > You can certainly do the n^2 ANDs and add things up, but that seems like a waste of gates. What is the cost metric being > considered here? Time (circuit depth)? Area? Time * Area? > > As a theoretical curiosity, Brent and Kung [1] came up with an optimal areatime nbit multiplier of depth O(n^(1/2) log > n) and area O(n log n). MelhornPreparata's [2] circuit would be O(log n), O(n^2 / log n). The above would be > (asymptotically) depth O(log n) and area O(n^2). > > [1] http://mathspeople.anu.edu.au/~brent/pub/pub055.html > [2] http://www.sciencedirect.com/science/article/pii/S0019995883800618 > >
Powered by blists  more mailing lists