Message-ID: <CAAS2fgTK_S2XMrvGbhyqd+KuuLVFhOrziFsYv+0EePngtgOuqg@mail.gmail.com>
Date: Wed, 1 Apr 2015 15:48:40 +0000
From: Gregory Maxwell <gmaxwell@...il.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] Compute time hardness

On Wed, Apr 1, 2015 at 3:20 PM, Bill Cox <waywardgeek@...il.com> wrote:
> This is pretty difficult to guesstimate, since we don't seem to have any
> highly experienced deep submicron IC designers involved on this list.  Maybe
> we should try to rope one in from Intel, AMD, or NVIDIA?
>
> We see good evidence that a multiply is at least 3X more computationally
> expensive.  If it were not, Intel would have a lower latency multiplier.  Most
> of the delay around an ADD operation is in moving the data around, not doing
> the addition, so there's an additional speed factor we are not sure of.  I
> think it will be pretty high, like 8X or 16X compared to a 32x32 -> 64 multiply.

There are standards for this, e.g. ITU basicops. But they're really
quite rubbish and I can pretty much promise that your random guesses
will be better.

It's not clear whether the costs I was asking about were, e.g., more
gate-area-like, more power-like, or more latency-like -- they're all
different when you get into the details.

What I was curious about was mostly whether there were any large
separations among the leaders of the high memory fill-rate/bandwidth
algorithms; e.g., is one doing a much better job of making use of
processor resources than another that is otherwise close in terms of
bandwidth? If there is a large separation, it may not matter how it's
counted.

(Rationale being that among algorithms which are relatively closely
matched in their ability to use memory-area and bandwidth one may
prefer the algorithm that requires more computation (however you
measure it) per unit wall-clock as a hedge against the economics of
attacking the memory area/bandwidth changing in the future.)
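
For concreteness, a minimal, hypothetical sketch (not from any particular
PHC entry) of the kind of latency-bound 32x32 -> 64 multiply chain that
adds computation per unit wall-clock: each step consumes the previous
step's result, so the loop's run time is bounded below by roughly
rounds * multiply latency no matter how much parallel hardware an
attacker has.

#include <stdint.h>

/* Hypothetical sketch: a serially dependent chain of 32x32 -> 64
 * multiplies plus adds.  Each iteration depends on the previous
 * iteration's result, so the loop cannot be parallelized; its
 * wall-clock time scales with the multiplier's latency, which is why
 * the multiply-vs-add latency ratio matters for compute hardness. */
static uint64_t mul_hardened_chain(uint64_t state, int rounds)
{
    for (int i = 0; i < rounds; i++) {
        uint32_t lo = (uint32_t)state;
        uint32_t hi = (uint32_t)(state >> 32);
        /* The 32x32 -> 64 multiply dominates the per-iteration
         * latency; the add folds the old state back in. */
        state = (uint64_t)lo * hi + state;
    }
    return state;
}

An add-only chain of the same length would finish several times sooner,
which is the gap the multiply is meant to close.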
