Message-ID: <555D591B.4030707@dei.uc.pt>
Date: Thu, 21 May 2015 05:03:39 +0100
From: Samuel Neves <sneves@....uc.pt>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] GPU vs CPU benchmarks for Makwa
On 05/18/2015 11:39 PM, Thomas Pornin wrote:
> Hello,
>
> I have made some OpenCL implementations for Makwa (actually, for the
> modular squarings where Makwa spends most of its time), optimized for a
> Radeon HD 7990 GPU ("Tahiti" devices). I also compared the resulting
> performance with what can be achieved with an Intel i7 4770K (Haswell
> core, implementation uses AVX2 opcodes).
>
> The comparison report is there:
> http://www.bolet.org/makwa/makwa-gpu-20150518.pdf
>
> The OpenCL code can be downloaded here:
> http://www.bolet.org/makwa/Makwa-OpenCL-20150518.tar.gz
>
Thanks, this is excellent work!
>
> Report highlights:
>
> -- I get 31.8 millions of modular squarings per second on the GPU.
>
> -- On the CPU, I can do 5.45 millions of modular squarings per second.
>
> -- When taking into account hardware cost and energy consumption (which
> is not as easy as it seems), the GPU turns out to be 1.72 times more
> efficient than the CPU.
I think this undersells the fact that the GPU code is entirely "portable" C, whereas the CPU code is highly
optimized assembly. I don't know the AMD toolchain well enough to say whether this makes any difference, but on old CUDA I
vaguely recall getting a ~15k/s -> ~30k/s performance boost from switching from C to inline PTX assembly in a few key
routines. So I think there is some potential for the GPU to extend its advantage by a further modest factor.
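
To put the quoted figures side by side, here is a minimal Python sketch. It is not taken from the report or the OpenCL code: the function names and the toy modulus are hypothetical, and the "implied normalisation" is only the ratio of the two numbers quoted above, not a figure the report states.

```python
# Throughput figures quoted in the report highlights above.
GPU_SQUARINGS_PER_SEC = 31.8e6
CPU_SQUARINGS_PER_SEC = 5.45e6
# Efficiency advantage quoted above, after hardware cost and energy.
QUOTED_EFFICIENCY_RATIO = 1.72


def raw_speedup(gpu_rate, cpu_rate):
    """Raw throughput ratio, ignoring hardware cost and energy."""
    return gpu_rate / cpu_rate


def repeated_modular_squaring(x, n, count):
    """Square x modulo n, count times in a row.

    Illustrative only: this is the kind of operation being benchmarked,
    written with Python big integers rather than optimized limb arithmetic.
    """
    for _ in range(count):
        x = (x * x) % n
    return x


raw = raw_speedup(GPU_SQUARINGS_PER_SEC, CPU_SQUARINGS_PER_SEC)
# Factor by which cost/energy normalisation shrinks the raw ratio
# (derived here from the quoted numbers; an assumption, not a report figure).
implied_normalisation = raw / QUOTED_EFFICIENCY_RATIO

print(f"raw GPU/CPU throughput ratio: {raw:.2f}")
print(f"implied cost/energy normalisation: {implied_normalisation:.2f}")

# Toy modulus (hypothetical, far smaller than a real one).
TOY_MODULUS = 1019 * 1031
# Repeated squaring is the same as raising to the power 2**count.
assert repeated_modular_squaring(5, TOY_MODULUS, 10) == pow(5, 2**10, TOY_MODULUS)
```

The raw ratio is much larger than the 1.72 efficiency figure, which is why any further speedup from hand-tuned GPU code would matter for the final comparison.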