Message-ID: <20151007113814.GA23226@bolet.org>
Date: Wed, 7 Oct 2015 13:38:14 +0200
From: Thomas Pornin <pornin@...et.org>
To: discussions@...sword-hashing.net
Cc: Solar Designer <solar@...nwall.com>
Subject: Re: [PHC] yescrypt on GPU

On Wed, Oct 07, 2015 at 09:29:09AM +0200, Massimo Del Zotto wrote:
> I have been told on the AMD CL forum (by 'realhet', who appears very
> proficient and up to date with GCN ASM) that GCN has no instruction
> latencies (i.e. it can consume a result in the instruction immediately
> following), probably a nice implication of the instructions being processed
> in 4-clock 'ticks'.
> This is mildly contrasting with my experience but since I go through the CL
> compiler, I'm inclined to believe him/her.

If you pass the '-save-temps' option to the CL compiler (e.g. in the
options string given to clBuildProgram()), then you will get a dump of
the intermediate representations, one in IL (the AMD "intermediate
language") and one in ISA (the actual assembly for the GCN device). I
recommend having a look at the latter, which is what the GPU will really
work on.
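
A minimal sketch of passing that option from host code. It assumes
PyOpenCL as the host binding (the message itself refers to the C call
clBuildProgram(), which PyOpenCL wraps); the helper names
build_options() and build_with_temps() are hypothetical and only serve
the illustration.

```python
def build_options(extra_options=()):
    """Return the compiler options with '-save-temps' appended exactly once."""
    options = list(extra_options)
    if "-save-temps" not in options:
        options.append("-save-temps")
    return options


def build_with_temps(context, kernel_source, extra_options=()):
    """Build an OpenCL program, asking the AMD compiler to keep its IL/ISA dumps.

    ASSUMPTION: PyOpenCL is installed and `context` is a pyopencl.Context
    for an AMD GCN device. Where the dump files land depends on the driver.
    """
    # Imported lazily so build_options() stays usable without PyOpenCL.
    import pyopencl as cl

    program = cl.Program(context, kernel_source)
    # Program.build() forwards the options to clBuildProgram().
    return program.build(options=build_options(extra_options))
```

After a successful build, the .isa dump is the file to read.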

In my experience, the "no latency" rule is _mostly_ true, but there are
a few instructions with a higher latency (especially multiplications on
32-bit operands and on double-width floating-point values). Memory
accesses can also incur extra latency if there is contention; and the
compiler may emit extra instructions in some cases, in particular when
it decides that it has run out of registers and needs to spill some of
them to RAM (and that's _global_ RAM, so spilling is a big performance
killer). You have to look at the ISA to know whether spilling occurred or
not.
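
As a rough way to automate that check, the sketch below scans the text
of an ISA dump for a scratch-size summary. It assumes the dump contains
a line of the form "ScratchSize = N dwords/thread"; that format is an
assumption and may differ between driver versions, so compare it against
your own dump before relying on it. The function name scratch_dwords()
is hypothetical.

```python
import re

# ASSUMPTION: the ISA dump reports register spilling through a summary line
# such as "ScratchSize = 0 dwords/thread". Adjust the pattern if your
# driver version words it differently.
SCRATCH_RE = re.compile(r"ScratchSize\s*=\s*(\d+)", re.IGNORECASE)


def scratch_dwords(isa_text):
    """Return the largest scratch size found in the dump text.

    Returns None when no matching line is present, in which case the
    dump has to be inspected by hand.
    """
    sizes = [int(match.group(1)) for match in SCRATCH_RE.finditer(isa_text)]
    return max(sizes) if sizes else None
```

A non-zero result suggests that the compiler ran out of registers and
spilled some of them to global RAM; a result of None means the pattern
did not match and says nothing either way.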


	--Thomas
