Date:	Thu, 10 Apr 2008 03:56:18 -0700 (PDT)
From:	David Miller <davem@...emloft.net>
To:	yoshfuji@...ux-ipv6.org
Cc:	shemminger@...tta.com, dada1@...mosbay.com, netdev@...r.kernel.org
Subject: Re: [PATCH 3/6] IPV4 : use xor rather than multiple ands for route
 compare

From: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@...ux-ipv6.org>
Date: Thu, 10 Apr 2008 18:01:48 +0900 (JST)

> In article <20080410.015118.103465510.davem@...emloft.net> (at Thu, 10 Apr 2008 01:51:18 -0700 (PDT)), David Miller <davem@...emloft.net> says:
> 
> > From: Stephen Hemminger <shemminger@...tta.com>
> > Date: Tue, 1 Apr 2008 13:08:42 -0700
> > 
> > > The flow fields are all together, and the other parameters are local variables
> > > in registers so that compare should be in one cache line.
> > > 
> > > --- a/net/ipv4/route.c	2008-03-31 17:12:30.000000000 -0700
> > > +++ b/net/ipv4/route.c	2008-04-01 13:05:46.000000000 -0700
> > > @@ -2079,12 +2079,12 @@ int ip_route_input(struct sk_buff *skb, 
> > >  	rcu_read_lock();
> > >  	for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
> > >  	     rth = rcu_dereference(rth->u.dst.rt_next)) {
> > > -		if (rth->fl.fl4_dst == daddr &&
> > > -		    rth->fl.fl4_src == saddr &&
> > > -		    rth->fl.iif == iif &&
> > > -		    rth->fl.oif == 0 &&
> > > +		if (((rth->fl.fl4_dst ^ daddr) |
> > > +		     (rth->fl.fl4_src ^ saddr) |
> > > +		     (rth->fl.iif ^ iif) |
> > > +		     rth->fl.oif |
> > > +		     (rth->fl.fl4_tos ^ tos)) == 0 &&
> > >  		    rth->fl.mark == skb->mark &&
> > > -		    rth->fl.fl4_tos == tos &&
> > >  		    net_eq(dev_net(rth->u.dst.dev), net) &&
> > >  		    rth->rt_genid == atomic_read(&rt_genid)) {
> > >  			dst_use(&rth->u.dst, jiffies);
> > 
> > Eric, any objections to this version?
> 
> I'm not Eric, but well, I'm now doubting whether this is really good.
> If the comparison chain is long and it is unlikely to pass all the tests,
> it would be better to cut the chain short.
> If we use "or", we need to run through every test, in any case.

Actually, the case you mention is part of the incentive for
this change.

Branch prediction fares very poorly in such cases, and
therefore it is better to risk mispredicting one branch
covering all the data items in the same cache line than to
risk mispredicting any one of several such branches.  The
new sequence above gets emitted by the compiler as several
integer operations and one branch.  As long as all the data
items are in the same cache line, this is optimal.
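
To make the difference concrete, here is a minimal userspace
sketch of the two forms.  struct flow_key and both function
names are hypothetical stand-ins for illustration, not the
kernel's struct flowi or any real helper.  What actually gets
emitted depends on the compiler, flags and target, so check
the generated assembly before drawing conclusions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the flow fields compared in the patch. */
struct flow_key {
	uint32_t dst;
	uint32_t src;
	int      iif;
	int      oif;
	uint8_t  tos;
};

/* Short-circuit form: written as one test (and possibly one branch) per field. */
static int key_match_branchy(const struct flow_key *k, uint32_t daddr,
			     uint32_t saddr, int iif, uint8_t tos)
{
	return k->dst == daddr &&
	       k->src == saddr &&
	       k->iif == iif &&
	       k->oif == 0 &&
	       k->tos == tos;
}

/*
 * XOR/OR form: a ^ b is zero only when a == b, so the OR of all the
 * XOR results is zero only when every field matches.  The whole
 * comparison collapses into a single test against zero.
 */
static int key_match_xor(const struct flow_key *k, uint32_t daddr,
			 uint32_t saddr, int iif, uint8_t tos)
{
	return ((k->dst ^ daddr) |
		(k->src ^ saddr) |
		(uint32_t)(k->iif ^ iif) |
		(uint32_t)k->oif |
		(uint32_t)(k->tos ^ tos)) == 0;
}
```

Both functions return 1 on a full match and 0 otherwise; the
second always reads every field, which is why it only pays off
when the fields share a cache line.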

We made such a change for ethernet address comparisons a
few years ago.  At the time Eric showed that it mattered
a lot for Athlon processors.
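
For reference, the same idea applied to a 6-byte ethernet
address looks roughly like the sketch below.  This is a
userspace illustration only: the function name is hypothetical,
and it uses memcpy() to sidestep alignment assumptions that
kernel code may handle differently.

```c
#include <stdint.h>
#include <string.h>

/*
 * Returns 0 when the two 6-byte addresses are equal, 1 otherwise.
 * The address is treated as three 16-bit words; the XOR of each
 * pair is OR-ed together and tested once, instead of comparing
 * byte by byte with a branch per byte.
 */
static unsigned ether_addr_differs(const uint8_t *addr1, const uint8_t *addr2)
{
	uint16_t a[3], b[3];

	/* Copy into aligned temporaries; compilers typically turn this into plain loads. */
	memcpy(a, addr1, sizeof(a));
	memcpy(b, addr2, sizeof(b));

	return ((a[0] ^ b[0]) | (a[1] ^ b[1]) | (a[2] ^ b[2])) != 0;
}
```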