Message-ID: <063D6719AE5E284EB5DD2968C1650D6D0F6DF4C4@AcuExch.aculab.com>
Date: Mon, 17 Mar 2014 14:29:27 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Eric Dumazet' <eric.dumazet@...il.com>,
Thomas Graf <tgraf@...g.ch>
CC: David Miller <davem@...emloft.net>,
John Fastabend <john.fastabend@...il.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH net-next] net: sched: use no more than one page in
struct fw_head
From: Eric Dumazet
> On Mon, 2014-03-17 at 13:51 +0000, Thomas Graf wrote:
> > On 03/16/14 at 09:06am, Eric Dumazet wrote:
> > > From: Eric Dumazet <edumazet@...gle.com>
> > >
> > > In commit b4e9b520ca5d ("[NET_SCHED]: Add mask support to fwmark
> > > classifier") Patrick added an u32 field in fw_head, making it slightly
> > > bigger than one page.
> > >
> > > Change the layout of this structure and let the compiler emit a reciprocal
> > > divide for fw_hash(), as this makes the code more readable and
> > > more efficient these days.
> >
> > I think you need to educate me a bit on this. objdump
> > spits out the following:
> >
> > static u32 fw_hash(u32 handle)
> > {
> > return handle % HTSIZE;
> > 1d: bf ff 01 00 00 mov edi,0x1ff
> > 22: 89 f0 mov eax,esi
> > 24: 31 d2 xor edx,edx
> > 26: f7 f7 div edi
> >
> > Doesn't look like a reciprocal div to me. Where did I
> > screw up or why doesn't gcc optimize it properly?
> > --
>
> That's because on your cpu, gcc knows the divide is cheaper than anything
> else (a multiply followed by a shift)
Especially for a modulus operation, which needs a second multiply (to turn
the quotient back into a remainder) and probably has issues with some
large values.
For a hash you could use '(handle * 0x1ffull) >> 32' to reduce the value
into range, or use a 'modulo 2^n-1 reduction':
	handle = (handle & 0x3ffff) + (handle >> 18);
	do
		handle = (handle & 0x1ff) + (handle >> 9);
	while (handle > 0x1ffu);
	/* 0x1ff is congruent to 0 and can survive the loop */
	if (handle == 0x1ffu)
		handle = 0;
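
A minimal sketch of the two reductions above as compilable C, assuming
HTSIZE is 0x1ff as in the quoted disassembly. The names hash_mulshift,
hash_fold and self_check are made up for illustration and are not from the
patch. The final test for 0x1ff in hash_fold is an addition: the loop alone
can stop at 0x1ff, which is congruent to 0 but is not a valid index into a
511-entry table.

```c
#include <assert.h>
#include <stdint.h>

#define HTSIZE 0x1ffu	/* 511 = 2^9 - 1 (assumed from the quoted listing) */

/* Multiply-shift: scales a 32-bit handle onto [0, HTSIZE) without a divide. */
static uint32_t hash_mulshift(uint32_t handle)
{
	return (uint32_t)(((uint64_t)handle * HTSIZE) >> 32);
}

/*
 * Folding: 2^9 == 1 (mod 511), so summing 18-bit and then 9-bit chunks
 * keeps the value unchanged modulo 511.
 */
static uint32_t hash_fold(uint32_t handle)
{
	handle = (handle & 0x3ffff) + (handle >> 18);
	do
		handle = (handle & 0x1ff) + (handle >> 9);
	while (handle > 0x1ffu);
	/* The loop can exit with 0x1ff, which is congruent to 0. */
	return handle == 0x1ffu ? 0 : handle;
}

/* Hypothetical self-check: compare the fold against a plain '%'. */
static void self_check(void)
{
	static const uint32_t samples[] = {
		0, 1, 510, 511, 512, 0x3ffff, 0x12345678, 0xffffffffu
	};
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		assert(hash_fold(samples[i]) == samples[i] % HTSIZE);
		assert(hash_mulshift(samples[i]) < HTSIZE);
	}
}
```

Note that the multiply-shift form does not return handle % HTSIZE. It is a
different hash that depends mostly on the high bits of handle, so whether it
suits depends on how the handle values are distributed.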
David