[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1285828299.5211.806.camel@edumazet-laptop>
Date: Thu, 30 Sep 2010 08:31:39 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Changli Gao <xiaosuo@...il.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
"Pekka Savola (ipv6)" <pekkas@...core.fi>,
James Morris <jmorris@...ei.org>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Patrick McHardy <kaber@...sh.net>, netdev@...r.kernel.org
Subject: Re: [PATCH] net: code cleanups
Le jeudi 30 septembre 2010 à 14:09 +0800, Changli Gao a écrit :
> On Thu, Sep 30, 2010 at 1:16 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> > Le jeudi 30 septembre 2010 à 10:24 +0800, Changli Gao a écrit :
> >> Compare operations are more readable, and compilers generate the same code
> >> for the both.
> >>
> >
> > You have a buggy compiler then.
>
> gcc version 4.4.3 (Gentoo 4.4.3-r2 p1.2)
>
> rth = rcu_dereference(rth->dst.rt_next)) {
> if ((((__force u32)rth->fl.fl4_dst ^ (__force u32)daddr) |
> ((__force u32)rth->fl.fl4_src ^ (__force u32)saddr) |
> (rth->fl.iif ^ iif) |
> 2f12: 44 3b 80 dc 00 00 00 cmp 0xdc(%rax),%r8d
> 2f19: 0f 85 a2 00 00 00 jne 2fc1 <ip_route_input_common+0x145
> >
> rth->fl.oif |
> 2f1f: 83 b8 d8 00 00 00 00 cmpl $0x0,0xd8(%rax)
> 2f26: 0f 85 95 00 00 00 jne 2fc1 <ip_route_input_common+0x145
> >
> tos &= IPTOS_RT_MASK;
> hash = rt_hash(daddr, saddr, iif, rt_genid(net));
>
> for (rth = rcu_dereference(rt_hash_table[hash].chain); rth;
> rth = rcu_dereference(rth->dst.rt_next)) {
> if ((((__force u32)rth->fl.fl4_dst ^ (__force u32)daddr) |
> 2f2c: 44 3b b8 e4 00 00 00 cmp 0xe4(%rax),%r15d
> 2f33: 0f 85 88 00 00 00 jne 2fc1
> <ip_route_input_common+0x145>
> ((__force u32)rth->fl.fl4_src ^ (__force u32)saddr) |
> 2f39: 44 3b b0 e8 00 00 00 cmp 0xe8(%rax),%r14d
> 2f40: 75 7f jne 2fc1
> <ip_route_input_common+0x145>
>
>
> >
> > I know this code is ugly, but please keep it as is, dont add conditional
> > branches on hot paths.
> >
>
> If the compiler doesn't generate conditional branches, we have to
> touch every necessary field of all the cache entries in one hash
> bucket. Is it better than condition branch? I think the compiler
> developers know it better.
Last famous words.
Are you aware of cache lines (64 bytes at least on typical cpus), and
that all fields are already in CPU L1 cache ? I (and others) worked hard
in the past.
>
> And the compiler reorders the conditional branches, is it expected?
>
Your compiler added conditional branches on a code not wanting them,
only because on _your_ cpu, these conditional branches might be cheap.
Now, try to compile for an i686 target and see the difference.
If there was no difference, your compiler would be _buggy_, because not
generating optimal assembly.
Here I get :
c141dda9: 8b 55 e8 mov -0x18(%ebp),%edx
c141ddac: 8b 81 9c 00 00 00 mov 0x9c(%ecx),%eax
c141ddb2: 33 91 a0 00 00 00 xor 0xa0(%ecx),%edx
c141ddb8: 31 f0 xor %esi,%eax
c141ddba: 09 d0 or %edx,%eax
c141ddbc: 8b 55 e0 mov -0x20(%ebp),%edx
c141ddbf: 33 91 94 00 00 00 xor 0x94(%ecx),%edx
c141ddc5: 09 d0 or %edx,%eax
c141ddc7: 0f b6 55 e7 movzbl -0x19(%ebp),%edx
c141ddcb: 0b 81 90 00 00 00 or 0x90(%ecx),%eax
c141ddd1: 32 91 a4 00 00 00 xor 0xa4(%ecx),%dl
c141ddd7: 0f b6 d2 movzbl %dl,%edx
c141ddda: 09 d0 or %edx,%eax
c141dddc: 0f 85 9d 00 00 00 jne c141de7f <ip_route_input_common+0x1b4>
As you can see, only one conditional branch.
Your patch is not welcomed, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists