Date:	Fri, 14 Nov 2014 16:46:18 +0100
From:	Hannes Frederic Sowa <hannes@...essinduktion.org>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	netdev@...r.kernel.org, ogerlitz@...lanox.com, pshelar@...ira.com,
	jesse@...ira.com, jay.vosburgh@...onical.com,
	discuss@...nvswitch.org
Subject: Re: [PATCH net-next] fast_hash: clobber registers correctly for
 inline function use

On Fri, 2014-11-14 at 07:33 -0800, Eric Dumazet wrote:
> On Fri, 2014-11-14 at 16:13 +0100, Hannes Frederic Sowa wrote:
> > > 
> > > 
> > > That's a lot of clobbers.
> > 
> > Yes, those are basically all callee-clobbered registers for the
> > particular architecture. I didn't look at the generated code for jhash
> > and crc_hash because I want this code to always be safe, independent of
> > the version and optimization levels of gcc.
> > 
> > > Alternative would be to use an assembly trampoline to save/restore them
> > > before calling __jhash2
> > 
> > This version provides the best hints on how to allocate registers to the
> > optimizers. E.g. it could avoid using callee-clobbered registers but use
> > callee-saved ones. If we build a trampoline, we need to save and reload
> > all registers all the time. This version just lets gcc decide how to do
> > that.
> > 
> > > __intel_crc4_2_hash2 can probably be written in assembly, it is quite
> > > simple.
> > 
> > Sure, but all the pre- and postconditions must hold for both jhash and
> > intel_crc4_2_hash, and I don't want to rewrite jhash in assembler.
> 
> We write optimized code for current cpus.
> 
> With current generation of cpus, we have crc32 support.

__intel_crc4_2_hash(2) already makes use of the crc32 instruction. I'll
have a closer look at what gcc generates.

> We don't care that the fallback has to save/restore a few registers, as
> the fallback has a huge cost anyway.
> 
> You don't have to write jhash() in assembler, you misunderstood me.

OK, understood: we only clobber the registers needed by the crc32_hash
implementation, and only when we branch to jhash do we save all the
other ones, in a trampoline directly before jhash.

> We only have to provide a trampoline in assembler, with maybe 10
> instructions.
> 
> Then gcc will know that we do not clobber registers for the optimized
> case.

Yes, makes sense.

I would still like to see the currently proposed fix applied; we can do
this on top. The inline call after this patch resembles a direct
function call, so apart from the long list of clobbers, it should still
be pretty fast.
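
To illustrate what "a direct function call with a long list of clobbers"
looks like, here is a minimal userspace sketch. It is not the patch
itself: demo_hash, demo_hash_fn and demo_hash_via_asm are hypothetical
names, the hash is a toy stand-in for jhash, and the register list
assumes the x86-64 System V calling convention with default compiler
flags. Unlike kernel code, userspace has a red zone and may use SSE in
the callee, so the sketch also skips the red zone, realigns the stack
and clobbers xmm0-xmm15.

```c
#include <stdint.h>

/* Hypothetical out-of-line hash standing in for the jhash fallback.
 * noinline keeps it a real call target. */
__attribute__((noinline))
uint32_t demo_hash(const uint32_t *data, uint32_t len, uint32_t seed)
{
	uint32_t h = seed;
	uint32_t i;

	for (i = 0; i < len; i++)
		h = (h ^ data[i]) * 16777619u;	/* FNV-style mixing, illustration only */
	return h;
}

typedef uint32_t (*demo_hash_fn)(const uint32_t *, uint32_t, uint32_t);

/* Call demo_hash from inline asm and tell gcc exactly which registers
 * the call may destroy, so it can keep live values in callee-saved ones. */
static inline uint32_t demo_hash_via_asm(const uint32_t *data, uint32_t len,
					 uint32_t seed)
{
#if defined(__x86_64__) && defined(__GNUC__) && !defined(_WIN32)
	demo_hash_fn fn = demo_hash;
	uint32_t hash;

	__asm__ volatile(
		"mov %%rsp, %%rbx\n\t"	/* remember the stack pointer */
		"sub $128, %%rsp\n\t"	/* step over the userspace red zone */
		"and $-16, %%rsp\n\t"	/* ABI: 16-byte aligned at the call */
		"call *%[fn]\n\t"
		"mov %%rbx, %%rsp"	/* rbx is callee-saved, so still valid */
		/* Argument registers may be overwritten by the callee, hence
		 * they are in/out operands rather than plain inputs. */
		: "=a" (hash), "+D" (data), "+S" (len), "+d" (seed)
		: [fn] "r" (fn)
		: "rbx", "rcx", "r8", "r9", "r10", "r11",
		  "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6",
		  "xmm7", "xmm8", "xmm9", "xmm10", "xmm11", "xmm12",
		  "xmm13", "xmm14", "xmm15", "memory", "cc");
	return hash;
#else
	/* Other targets: plain C call, the compiler handles everything. */
	return demo_hash(data, len, seed);
#endif
}
```

Calling demo_hash_via_asm(words, 4, 7) on a four-word array should give
the same value as calling demo_hash directly. The clobber list only
restricts gcc's register allocation around the asm statement; nothing is
saved or restored unless a clobbered register actually holds a live
value.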

Thanks,
Hannes

