[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a781481a0707230213n7672223duf8008926d53012bc@mail.gmail.com>
Date: Mon, 23 Jul 2007 14:43:58 +0530
From: "Satyam Sharma" <satyam.sharma@...il.com>
To: "Nick Piggin" <nickpiggin@...oo.com.au>
Cc: "Linus Torvalds" <torvalds@...ux-foundation.org>,
"Andrew Morton" <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] AFS: Fix file locking
[ Restricting discussion to the i386 bitops implementation. ]
Hi Nick,
On 7/23/07, Satyam Sharma <satyam.sharma@...il.com> wrote:
> Hi,
>
> On 7/23/07, Nick Piggin <nickpiggin@...oo.com.au> wrote:
> > Linus Torvalds wrote:
> > >
> > > On Fri, 20 Jul 2007, Nick Piggin wrote:
> > >
> > >>So you did. Then to answer that, yes it could be faster because there are
> > >>stupid volatiles sprinkled all over the bitops code so you could easily
> > >>end up having to do more loads. Does it make a real difference? Unlikely,
> > >>but David loves counting cycles :)
> > >
> > >
> > > I thought we long long since removed the volatiles. They are buggy and
> > > horrible, and we really want to let the compiler combine multiple
> > > test-bits, and if they matter that implies locking is buggy or something
> > > worse..
> > >
> > > Ie we'd *want*
> > >
> > > if (test_bit(x, y) || test_bit(z,y))
> > >
> > > to be rewritten by the compiler as testing bits x/z at the same time.
> >
> > Yep. We'd also want __set_bit(x, y); __set_bit(z, y); and such to be
> > combined.
BTW I'm also running some tests writing test code, compiling and verifying
the code gcc generates ... curiously, volatile-access-casting of the passed
bit-string address is not the only thing that'll prevent gcc's optimizer from
combining the operations such as ones you listed above. Then there are
-O2 vs -Os (and constant_test_bit() vs variable_test_bit()) differences I am
observing ... and sometimes just the inadequacy of gcc's optimizer -- note
that constant_test_bit() seems to go through extra hoops unnecessarily to
avoid honouring @nr >= 32, whereas none of the other primitives in that file
does that. So the i386 kernel's stock constant_test_bit implementation
ends up differing from David's open-coded versions quite drastically in
subtle ways, and again makes it difficult to combine the kind of operations
you guys are discussing here ...
It's a given, of course, that the code that gcc generates when combining
such operations would clearly be more optimal than the simple btl-sbbl pair
with test-and-conditional-jumps that would otherwise get generated ...
> > > But now I'm too scared to look.
> >
> > Not a chance :) Even the asm-generic "reference" implementation ratifies
> > the volatile crapiness. Would you take a patch?
>
> Coincidentally, I'm working on a cleanup of the bitops code just now --
> I stumbled upon a lot of varied bogosity in there :-)
Such as bogus/invalid asm constraints being passed in the inline assembly.
Probably gcc knows everybody gets its complicated extended asm wrong,
so doesn't barf when parsing such stuff ... :-)
> Intend to send it
> out in a couple of hours, probably.
Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists