[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0802181500410.13406@vixen.sonytel.be>
Date: Mon, 18 Feb 2008 15:01:35 +0100 (CET)
From: Geert Uytterhoeven <Geert.Uytterhoeven@...ycom.com>
To: Adrian Bunk <bunk@...nel.org>
cc: Michael Ellerman <michael@...erman.id.au>,
Roel Kluin <12o3l@...cali.nl>,
lkml <linux-kernel@...r.kernel.org>, cbe-oss-dev@...abs.org,
linuxppc-dev@...abs.org, Willy Tarreau <w@....eu>,
Arjan van de Ven <arjan@...radead.org>
Subject: Re: [PATCH 1/3] Fix Unlikely(x) == y
On Mon, 18 Feb 2008, Adrian Bunk wrote:
> On Sun, Feb 17, 2008 at 10:50:03PM +1100, Michael Ellerman wrote:
> > On Sat, 2008-02-16 at 10:39 -0800, Arjan van de Ven wrote:
> >...
> > > for mordern (last 10 years) x86 cpus... the cpu branchpredictor is better
> > > than the coder in general. Same for most other architectures.
> > >
> > > unlikely() creates bigger code as well.
> > >
> > > Now... we're talking about your super duper hotpath function here right?
> > > One where you care about 0.5 cycle speed improvement? (less on modern
> > > systems ;)
> >
> > The first patch was to platforms/ps3 code, which runs on the Cell, in
> > particular the PPE ... which is not an x86 :)
> >
> > eg:
> >
> > [michael@...oenaich ~]$ cat branch.c
> > #include <stdio.h>
> > int main(void)
> > {
> > int i, j;
> >
> > for (i = 0, j = 0; i < 1000000000; i++)
> > if (i % 4 == 0)
> > j++;
> >
> > printf("j = %d\n", j);
> > return 0;
> > }
> > [michael@...oenaich ~]$ ppu-gcc -Wall -O3 -o branch branch.c
> > [michael@...oenaich ~]$ time ./branch
> > real 0m5.172s
> >
> > [michael@...oenaich ~]$ cat branch.c
> > ..
> > for (i = 0, j = 0; i < 1000000000; i++)
> > if (__builtin_expect(i % 4 == 0, 0))
> > j++;
> > ..
> > [michael@...oenaich ~]$ ppu-gcc -Wall -O3 -o branch branch.c
> > [michael@...oenaich ~]$ time ./branch
> > real 0m3.762s
> >
> >
> > Which looks as though unlikely() is helping. Admittedly we don't have a
> > lot of kernel code that looks like that, but at least unlikely is doing
> > what we want it to.
>
> This means it generates faster code with a current gcc for your platform.
>
> But a future gcc might e.g. replace the whole loop with a division
> (gcc SVN head (that will soon become gcc 4.3) already does
> transformations like replacing loops with divisions [1]).
Hence shouldn't we ask the gcc people what's the purpose of __builtin_expect(),
if it doesn't live up to its promise?
With kind regards,
Geert Uytterhoeven
Software Architect
Sony Network and Software Technology Center Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium
Phone: +32 (0)2 700 8453
Fax: +32 (0)2 700 8622
E-mail: Geert.Uytterhoeven@...ycom.com
Internet: http://www.sony-europe.com/
Sony Network and Software Technology Center Europe
A division of Sony Service Centre (Europe) N.V.
Registered office: Technologielaan 7 · B-1840 Londerzeel · Belgium
VAT BE 0413.825.160 · RPR Brussels
Fortis Bank Zaventem · Swift GEBABEBB08A · IBAN BE39001382358619
Powered by blists - more mailing lists