Message-ID: <alpine.DEB.1.10.0804181810500.6108@twinlark.arctic.org>
Date: Fri, 18 Apr 2008 18:11:22 -0700 (PDT)
From: dean gaudet <dean@...tic.org>
To: Harvey Harrison <harvey.harrison@...il.com>
cc: Joe Perches <joe@...ches.com>,
Alexander van Heukelum <heukelum@...lshack.com>,
Alexander van Heukelum <heukelum@...tmail.fm>,
Ingo Molnar <mingo@...e.hu>, Andi Kleen <andi@...stfloor.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: Alternative implementation of the generic __ffs
On Fri, 18 Apr 2008, Harvey Harrison wrote:
> On Fri, 2008-04-18 at 17:58 -0700, Joe Perches wrote:
> > On Fri, 2008-04-18 at 17:20 -0700, dean gaudet wrote:
> > > any reasonable compiler should figure out the two are the same... but i
> > > really prefer spelling out the lack of dependencies of the computations by
> > > breaking it out per-bit.
> >
> > It seems gcc 4.3 (-Os or -O2) isn't a reasonable compiler.
> >
> > I think this might be best:
> >
> > int ffs32(unsigned int value)
> > {
> > 	int x;
> >
> > 	value &= -value;
> > 	if (!(value & 0x55555555))
> > 		x = 1;
> > 	else
> > 		x = 0;
> > 	if (!(value & 0x33333333))
> > 		x |= 2;
> > 	if (!(value & 0x0f0f0f0f))
> > 		x |= 4;
> > 	if (!(value & 0x00ff00ff))
> > 		x |= 8;
> > 	if (!(value & 0x0000ffff))
> > 		x |= 16;
> >
> > 	return x;
> > }
> >
>
> That produces the shortest assembly for me, also uses the fewest
> registers.

unfortunately it kind of defeats the purpose of the original code... which
is high parallelism / no-dependencies.

have you benchmarked it?
-dean
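
For readers following the thread, here is a minimal sketch of the contrast
being discussed. It is not the code from the original patch. The names
ffs32_chain, ffs32_perbit and ffs32_check_agree are hypothetical and chosen
only for illustration. ffs32_chain mirrors the variant quoted above, where
every step updates the same x. ffs32_perbit spells out each result bit as its
own expression, so no bit depends on another until the final OR. Both assume
a nonzero input, as __ffs does; for 0 the two variants return different
values. Whether the per-bit form is actually faster is exactly what the
benchmark question asks, and nothing here measures it.

```c
#include <assert.h>
#include <stdint.h>

/* Variant quoted in the thread: each step accumulates into the same x. */
static int ffs32_chain(uint32_t value)
{
	int x;

	value &= 0u - value;	/* isolate the lowest set bit */
	x = !(value & 0x55555555u) ? 1 : 0;
	if (!(value & 0x33333333u))
		x |= 2;
	if (!(value & 0x0f0f0f0fu))
		x |= 4;
	if (!(value & 0x00ff00ffu))
		x |= 8;
	if (!(value & 0x0000ffffu))
		x |= 16;

	return x;
}

/*
 * Hypothetical per-bit variant: every bit of the index is computed from
 * the isolated low bit alone, so the five tests are independent of each
 * other and are only combined in the return expression.
 */
static int ffs32_perbit(uint32_t value)
{
	uint32_t low = value & (0u - value);	/* lowest set bit only */
	int b0 = (low & 0xaaaaaaaau) != 0;	/* index bit 0 */
	int b1 = (low & 0xccccccccu) != 0;	/* index bit 1 */
	int b2 = (low & 0xf0f0f0f0u) != 0;	/* index bit 2 */
	int b3 = (low & 0xff00ff00u) != 0;	/* index bit 3 */
	int b4 = (low & 0xffff0000u) != 0;	/* index bit 4 */

	return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3) | (b4 << 4);
}

/* Check that both variants agree for every possible lowest-bit position. */
static void ffs32_check_agree(void)
{
	for (int bit = 0; bit < 32; bit++) {
		uint32_t single = (uint32_t)1 << bit;
		uint32_t noisy = 0u - single;	/* this bit and all above it */

		assert(ffs32_chain(single) == bit);
		assert(ffs32_perbit(single) == bit);
		assert(ffs32_chain(noisy) == bit);
		assert(ffs32_perbit(noisy) == bit);
	}
}
```

Calling ffs32_check_agree() from a test program confirms that the two forms
return the same 0-based index for all 32 bit positions. A compiler is free to
turn either form into the other, so the generated assembly and a timing run
on the target CPU are what settle the question.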