Date:	Mon, 23 Jul 2007 14:43:58 +0530
From:	"Satyam Sharma" <satyam.sharma@...il.com>
To:	"Nick Piggin" <nickpiggin@...oo.com.au>
Cc:	"Linus Torvalds" <torvalds@...ux-foundation.org>,
	"Andrew Morton" <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] AFS: Fix file locking

[ Restricting discussion to the i386 bitops implementation. ]

Hi Nick,

On 7/23/07, Satyam Sharma <satyam.sharma@...il.com> wrote:
> Hi,
>
> On 7/23/07, Nick Piggin <nickpiggin@...oo.com.au> wrote:
> > Linus Torvalds wrote:
> > >
> > > On Fri, 20 Jul 2007, Nick Piggin wrote:
> > >
> > >>So you did. Then to answer that, yes it could be faster because there are
> > >>stupid volatiles sprinkled all over the bitops code so you could easily
> > >>end up having to do more loads. Does it make a real difference? Unlikely,
> > >>but David loves counting cycles :)
> > >
> > >
> > > I thought we long long since removed the volatiles. They are buggy and
> > > horrible, and we really want to let the compiler combine multiple
> > > test-bits, and if they matter that implies locking is buggy or something
> > > worse..
> > >
> > > Ie we'd *want*
> > >
> > >       if (test_bit(x, y) || test_bit(z,y))
> > >
> > > to be rewritten by the compiler as testing bits x/z at the same time.
> >
> > Yep. We'd also want __set_bit(x, y); __set_bit(z, y); and such to be
> > combined.
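
To make that concrete, here's a minimal sketch -- my own toy helper for
illustration, not the kernel's __set_bit() -- of what a plain-C, non-volatile,
non-asm set-bit would let gcc do (assuming 32-bit longs, as on i386):

/* Toy set-bit: plain C, no asm, no volatile access. */
static inline void set_bit_plain(int nr, unsigned long *addr)
{
        addr[nr >> 5] |= 1UL << (nr & 31);
}

void set_two(unsigned long *y)
{
        /*
         * With no asm statement and no volatile access in the way, gcc is
         * free to fold these into one load, one OR with the combined mask
         * (1UL << 3) | (1UL << 5), and one store.
         */
        set_bit_plain(3, y);
        set_bit_plain(5, y);
}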

BTW, I'm also writing some test code, compiling it, and checking what gcc
generates ... curiously, the volatile-access cast on the passed bit-string
address is not the only thing that prevents gcc's optimizer from combining
operations such as the ones you listed above. There are also -O2 vs -Os (and
constant_test_bit() vs variable_test_bit()) differences I'm observing ... and
sometimes it's just the inadequacy of gcc's optimizer -- note that
constant_test_bit() seems to go through unnecessary extra hoops to avoid
honouring @nr >= 32, which none of the other primitives in that file do. So
the i386 kernel's stock constant_test_bit() implementation ends up differing
from David's open-coded versions in subtle but significant ways, and that
again makes it hard to combine the kind of operations you guys are discussing
here ...
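
Here's roughly the shape of the test code I'm playing with -- the helpers
below are my own simplified stand-ins (again assuming 32-bit longs), not the
stock bitops, but they show the effect of the volatile cast in isolation:

static unsigned long flags[2];

/* Plain-C test: nothing stops gcc from merging adjacent calls. */
static inline int test_bit_plain(int nr, const unsigned long *addr)
{
        return (addr[nr >> 5] >> (nr & 31)) & 1;
}

/*
 * Same thing, but with the volatile cast that the stock bitops apply to
 * the bit-string address: every call must now perform its own load.
 */
static inline int test_bit_volatile(int nr, const unsigned long *addr)
{
        const volatile unsigned long *p = (const volatile unsigned long *)addr;

        return (p[nr >> 5] >> (nr & 31)) & 1;
}

/* gcc may fold this into a single load and a single mask test. */
int either_plain(void)
{
        return test_bit_plain(3, flags) || test_bit_plain(5, flags);
}

/* Two separate loads and two separate tests; no combining possible. */
int either_volatile(void)
{
        return test_bit_volatile(3, flags) || test_bit_volatile(5, flags);
}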

It goes without saying, of course, that the code gcc generates when it can
combine such operations is better than the btl-sbbl pairs and
test-and-conditional-jumps that would otherwise get generated ...
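
(For comparison, the asm-based variable_test_bit() is roughly of the shape
below -- paraphrased for illustration, not a verbatim copy of the kernel's --
and since gcc never merges two asm statements, each call necessarily keeps
its own btl/sbbl plus the surrounding test and conditional jump:)

/*
 * Rough paraphrase of an i386 btl/sbbl bit test, illustration only:
 * btl copies the tested bit into CF, sbbl then turns CF into 0 or -1.
 */
static inline int test_bit_btl(int nr, const volatile unsigned long *addr)
{
        int oldbit;

        __asm__ __volatile__("btl %2,%1\n\tsbbl %0,%0"
                             : "=r" (oldbit)
                             : "m" (*addr), "Ir" (nr));
        return oldbit != 0;
}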

> > > But now I'm too scared to look.
> >
> > Not a chance :) Even the asm-generic "reference" implementation ratifies
> > the volatile crapiness. Would you take a patch?
>
> Coincidentally, I'm working on a cleanup of the bitops code just now --
> I stumbled upon a lot of varied bogosity in there :-)

Such as bogus/invalid constraints being passed in the inline assembly. gcc
probably knows that everybody gets its complicated extended asm wrong, so it
doesn't barf when parsing such stuff ... :-)
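
As a made-up illustration of the kind of thing I mean -- not a quote of the
actual bitops code -- compare an output-only "=m" constraint on an operand
that btsl in fact reads *and* writes, versus the honest "+m":

/*
 * Hypothetical bogus constraint: "=m" tells gcc the old value of *addr is
 * dead, even though btsl performs a read-modify-write on that word.
 */
static inline void set_bit_bad(int nr, unsigned long *addr)
{
        __asm__("btsl %1,%0" : "=m" (*addr) : "Ir" (nr));
}

/* The honest version: "+m" marks the operand as both read and written. */
static inline void set_bit_ok(int nr, unsigned long *addr)
{
        __asm__("btsl %1,%0" : "+m" (*addr) : "Ir" (nr));
}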

> Intend to send it
> out in a couple of hours, probably.

Satyam
