netdev - RE: [PATCH net-next] net: sched: use no more than one page in struct fw

Message-ID: <063D6719AE5E284EB5DD2968C1650D6D0F6DF4C4@AcuExch.aculab.com>
Date:	Mon, 17 Mar 2014 14:29:27 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	'Eric Dumazet' <eric.dumazet@...il.com>,
	Thomas Graf <tgraf@...g.ch>
CC:	David Miller <davem@...emloft.net>,
	John Fastabend <john.fastabend@...il.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH net-next] net: sched: use no more than one page in
 struct fw_head

From: Eric Dumazet
> On Mon, 2014-03-17 at 13:51 +0000, Thomas Graf wrote:
> > On 03/16/14 at 09:06am, Eric Dumazet wrote:
> > > From: Eric Dumazet <edumazet@...gle.com>
> > >
> > > In commit b4e9b520ca5d ("[NET_SCHED]: Add mask support to fwmark
> > > classifier") Patrick added a u32 field in fw_head, making it slightly
> > > bigger than one page.
> > >
> > > Change the layout of this structure and let the compiler emit a
> > > reciprocal divide for fw_hash(), as this makes the code more readable
> > > and more efficient these days.
> >
> > I think you need to educate me a bit on this. objdump
> > spits out the following:
> >
> > static u32 fw_hash(u32 handle)
> > {
> >         return handle % HTSIZE;
> >   1d:   bf ff 01 00 00          mov    edi,0x1ff
> >   22:   89 f0                   mov    eax,esi
> >   24:   31 d2                   xor    edx,edx
> >   26:   f7 f7                   div    edi
> >
> > Doesn't look like a reciprocal div to me. Where did I
> > screw up or why doesn't gcc optimize it properly?
> > --
> 
> That's because on your cpu, gcc knows the divide is cheaper than anything
> else (a multiply followed by a shift).

Especially for a modulus operation - which requires a second multiply
and probably has issues with some large values.
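
To make the "second multiply" concrete, here is a rough sketch of a
reciprocal-based modulus by 511. The function name fw_hash_recip and the
constant names are made up for illustration, and HTSIZE = 511 is an
assumption taken from the 0x1ff in the objdump above. The reciprocal
ceil(2^41 / 511) needs 33 bits, so its top bit is folded into an addition
to keep the first product within 64 bits:

```c
#include <stdint.h>

/* Assumed table size, taken from the 0x1ff divisor in the objdump output. */
#define HTSIZE 511u

/* Low 32 bits of the 33-bit constant ceil(2^41 / 511) = 2^32 + 0x804021. */
#define RECIP_LOW 0x804021u

/* Hypothetical helper: handle % 511 without a divide instruction. */
static uint32_t fw_hash_recip(uint32_t handle)
{
	/* First multiply: a 32x32 product always fits in 64 bits. */
	uint64_t t = ((uint64_t)handle * RECIP_LOW) >> 32;
	/* Adding handle supplies the implicit 2^32 term of the constant. */
	uint32_t q = (uint32_t)(((uint64_t)handle + t) >> 9);
	/* Second multiply: turn the quotient back into a remainder. */
	return handle - q * HTSIZE;
}
```

The 33-bit constant is one example of the trouble with large values: a
plain 64-bit 'handle * constant' would overflow here.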

For a hash you could use '(handle * 0x1ffull) >> 32' to reduce the hash.
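
Spelled out as a function, that might look like the sketch below. The
name fw_hash_mulshift is hypothetical and HTSIZE = 0x1ff is assumed from
the objdump above; the cast to a 64-bit type does the job of the 'ull'
suffix:

```c
#include <stdint.h>

/* Assumed table size, taken from the 0x1ff divisor in the objdump output. */
#define HTSIZE 0x1ffu

/* Hypothetical helper: scale a 32-bit handle into the range [0, HTSIZE). */
static uint32_t fw_hash_mulshift(uint32_t handle)
{
	/* The 64-bit product is below HTSIZE * 2^32, so the top word is < HTSIZE. */
	return (uint32_t)(((uint64_t)handle * HTSIZE) >> 32);
}
```

This is not the same mapping as 'handle % HTSIZE': the bucket is chosen by
the high bits of handle, so every handle below about 2^32 / 511 lands in
bucket 0. It probably only makes sense if the handle bits are mixed first.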

or use a 'modulo 2^n-1 reduction':
	handle = (handle & 0x3ffff) + (handle >> 18);
	do
		handle = (handle & 0x1ff) + (handle >> 9);
	while (handle > 0x1ffu);
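
As a self-contained sketch (fw_hash_fold is a made-up name), the folding
above works because 2^9 and 2^18 are both congruent to 1 modulo 511. The
final check is an addition on top of the snippet above, under the
assumption that the result indexes a 511-entry table: the loop can stop
at 0x1ff itself, which is congruent to 0 but would be out of range.

```c
#include <stdint.h>

/* Hypothetical helper: handle % 511 using only masks, shifts and adds. */
static uint32_t fw_hash_fold(uint32_t handle)
{
	/* 2^18 == 1 (mod 511): fold the top 14 bits onto the low 18 bits. */
	handle = (handle & 0x3ffff) + (handle >> 18);
	/* 2^9 == 1 (mod 511): keep folding 9-bit groups until it fits. */
	do
		handle = (handle & 0x1ff) + (handle >> 9);
	while (handle > 0x1ffu);
	/* Added assumption: map the leftover 0x1ff (== 0 mod 511) to 0. */
	return handle == 0x1ffu ? 0 : handle;
}
```

With that last step the result should match 'handle % 511' for every
32-bit handle.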

	David