Message-ID: <CAOLP8p4a5tHOr6fsTtjeL-Twei-i29UrEAihKs+858J5v1-iwQ@mail.gmail.com>
Date: Thu, 27 Feb 2014 07:20:45 -0500
From: Bill Cox <waywardgeek@...il.com>
To: discussions@...sword-hashing.net
Subject: GPU multiplication speed?

I'm trying to figure out how effective GPUs could be against multiplication-based compute-time hardened KDFs. A modern CPU does a multiply in about 1ns. If I am not mistaken, a GPU takes 4 or more clocks to do an integer multiply, and GPUs run no faster than about 1GHz, meaning at least 4ns per multiply.

If a defender hashes his password using 2GiB in 0.5 seconds on his PC in a multiplication-limited loop (easily done on 1 thread of a 3GHz+ Sandy Bridge, Ivy Bridge, or Haswell CPU and a single bank of 1,666MHz DDR3 RAM), and an attacker has an 8GiB graphics card (do they sell bigger?), then the attacker could only do 2 guesses in parallel, and each guess would take 4 times longer, at about 2 seconds per guess, or 1 guess per second of total throughput. That's half what the CPU can do.

Are my numbers right? My Google-fu is failing me this morning. I could not find a table of NVDA or ATI GPU instruction latencies. I did find one post suggesting the use of 24x24 multiplication because it's only 4 clocks, but there was no mention of 32x32 latency.

Bill