[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <46C55E90.7010407@yahoo.com.au>
Date: Fri, 17 Aug 2007 18:38:40 +1000
From: Nick Piggin <nickpiggin@...oo.com.au>
To: Satyam Sharma <satyam@...radead.org>
CC: Herbert Xu <herbert@...dor.apana.org.au>,
Paul Mackerras <paulus@...ba.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Christoph Lameter <clameter@....com>,
Chris Snook <csnook@...hat.com>,
Ilpo Jarvinen <ilpo.jarvinen@...sinki.fi>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Stefan Richter <stefanr@...6.in-berlin.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-arch@...r.kernel.org, Netdev <netdev@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>, ak@...e.de,
heiko.carstens@...ibm.com, David Miller <davem@...emloft.net>,
schwidefsky@...ibm.com, wensong@...ux-vs.org, horms@...ge.net.au,
wjiang@...ilience.com, cfriesen@...tel.com, zlynx@....org,
rpjday@...dspring.com, jesper.juhl@...il.com,
segher@...nel.crashing.org
Subject: Re: [PATCH 0/24] make atomic_read() behave consistently across all
architectures
Satyam Sharma wrote:
>
> On Fri, 17 Aug 2007, Herbert Xu wrote:
>
>
>>On Fri, Aug 17, 2007 at 01:43:27PM +1000, Paul Mackerras wrote:
>>
>>BTW, the sort of missing barriers that triggered this thread
>>aren't that subtle. It'll result in a simple lock-up if the
>>loop condition holds upon entry. At which point it's fairly
>>straightforward to find the culprit.
>
>
> Not necessarily. A barrier-less buggy code such as below:
>
> atomic_set(&v, 0);
>
> ... /* some initial code */
>
> while (atomic_read(&v))
> ;
>
> ... /* code that MUST NOT be executed unless v becomes non-zero */
>
> (where v->counter is has no volatile access semantics)
>
> could be generated by the compiler to simply *elid* or *do away* with
> the loop itself, thereby making the:
>
> "/* code that MUST NOT be executed unless v becomes non-zero */"
>
> to be executed even when v is zero! That is subtle indeed, and causes
> no hard lockups.
Then I presume you mean
while (!atomic_read(&v))
;
Which is just the same old infinite loop bug solved with cpu_relax().
These are pretty trivial to audit and fix, and also to debug, I would
think.
> Granted, the above IS buggy code. But, the stated objective is to avoid
> heisenbugs.
Anyway, why are you making up code snippets that are buggy in other
ways in order to support this assertion being made that lots of kernel
code supposedly depends on volatile semantics. Just reference the
actual code.
> And we have driver / subsystem maintainers such as Stefan
> coming up and admitting that often a lot of code that's written to use
> atomic_read() does assume the read will not be elided by the compiler.
So these are broken on i386 and x86-64?
Are they definitely safe on SMP and weakly ordered machines with
just a simple compiler barrier there? Because I would not be
surprised if there are a lot of developers who don't really know
what to assume when it comes to memory ordering issues.
This is not a dig at driver writers: we still have memory ordering
problems in the VM too (and probably most of the subtle bugs in
lockless VM code are memory ordering ones). Let's not make up a
false sense of security and hope that sprinkling volatile around
will allow people to write bug-free lockless code. If a writer
can't be bothered reading API documentation and learning the Linux
memory model, they can still be productive writing safely locked
code.
--
SUSE Labs, Novell Inc.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists