Message-ID: <555D591B.4030707@dei.uc.pt>
Date: Thu, 21 May 2015 05:03:39 +0100
From: Samuel Neves <sneves@....uc.pt>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] GPU vs CPU benchmarks for Makwa

On 05/18/2015 11:39 PM, Thomas Pornin wrote:
> Hello,
>
> I have made some OpenCL implementations for Makwa (actually, for the
> modular squarings where Makwa spends most of its time), optimized for a
> Radeon HD 7990 GPU ("Tahiti" devices). I also compared the resulting
> performance with what can be achieved with an Intel i7 4770K (Haswell
> core, implementation uses AVX2 opcodes).
>
> The comparison report is there:
>    http://www.bolet.org/makwa/makwa-gpu-20150518.pdf
>
> The OpenCL code can be downloaded here:
>    http://www.bolet.org/makwa/Makwa-OpenCL-20150518.tar.gz
>

Thanks, this is excellent work!

>
> Report highlights:
>
>  -- I get 31.8 millions of modular squarings per second on the GPU.
>
>  -- On the CPU, I can do 5.45 millions of modular squarings per second.
>
>  -- When taking into account hardware cost and energy consumption (which
>  is not as easy as it seems), the GPU turns out to be 1.72 times more
>  efficient than the CPU.

I think this undersells the fact that the GPU code is entirely "portable" C, whereas the CPU code is highly
optimized assembly. I don't know the AMD toolchain well enough to say whether this makes any difference, but on old CUDA I
vaguely recall getting a ~15k/s -> ~30k/s performance boost from switching from C to inline PTX assembly in a few key
routines. So I think there is some potential for the GPU to gain a further modest factor over the CPU.
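
As a rough sanity check on the figures quoted above, here is a minimal sketch of the arithmetic. The throughput numbers and the 1.72 efficiency ratio come from the report highlights. The 2x GPU gain is purely hypothetical: it is borrowed from the CUDA anecdote and is not a measurement on Tahiti. The function names are mine, and the sketch assumes hardware cost and energy consumption stay unchanged when the GPU code gets faster.

```python
# Figures quoted from the report highlights (modular squarings per second).
GPU_SQUARINGS_PER_S = 31.8e6
CPU_SQUARINGS_PER_S = 5.45e6
# Reported GPU/CPU efficiency ratio after accounting for cost and energy.
REPORTED_EFFICIENCY_RATIO = 1.72


def raw_speedup(gpu_rate, cpu_rate):
    """Raw throughput ratio, ignoring hardware cost and energy."""
    return gpu_rate / cpu_rate


def implied_cost_factor(raw_ratio, efficiency_ratio):
    """How much the cost/energy accounting shrinks the raw ratio."""
    return raw_ratio / efficiency_ratio


def scaled_efficiency(efficiency_ratio, gpu_gain):
    """Efficiency ratio if GPU throughput improved by gpu_gain.

    ASSUMPTION: cost and energy are unchanged by the speedup.
    """
    return efficiency_ratio * gpu_gain


raw = raw_speedup(GPU_SQUARINGS_PER_S, CPU_SQUARINGS_PER_S)
cost = implied_cost_factor(raw, REPORTED_EFFICIENCY_RATIO)
# HYPOTHETICAL 2x gain, by analogy with the C -> inline PTX anecdote.
what_if = scaled_efficiency(REPORTED_EFFICIENCY_RATIO, 2.0)

print(f"raw GPU/CPU throughput ratio: {raw:.2f}")
print(f"implied cost/energy factor:   {cost:.2f}")
print(f"efficiency with a 2x GPU gain: {what_if:.2f}")
```

In other words, the GPU is a bit under 6x faster in raw throughput, and the cost and energy accounting absorbs most of that. Any gain from hand-tuned GPU code would scale the 1.72 figure directly, under the stated assumption.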


