[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100920012915.GI3060@linux.vnet.ibm.com>
Date: Sun, 19 Sep 2010 18:29:15 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: James Bottomley <James.Bottomley@...senPartnership.com>
Cc: Miklos Szeredi <miklos@...redi.hu>, benh@...nel.crashing.org,
dhowells@...hat.com, linux-kernel@...r.kernel.org,
linux-arch@...r.kernel.org
Subject: Re: memory barrier question
On Sun, Sep 19, 2010 at 07:58:36PM -0500, James Bottomley wrote:
> On Sun, 2010-09-19 at 14:59 -0700, Paul E. McKenney wrote:
> > On Sun, Sep 19, 2010 at 10:15:51PM +0200, Miklos Szeredi wrote:
> > > On Sun, 19 Sep 2010, Paul E. McKenney wrote:
> > > > Give it a few years. There are reportedly already other compilers that do
> > > > this, which is not too surprising given that the perception of insanity
> > > > is limited to lockless parallel code. If you have single-threaded code,
> > > > such as code and data under a lock (where the data is never accessed
> > > > without holding that lock), then this sort of optimization is pretty safe.
> > > > I still don't like it, but the compiler guys would argue that this is
> > > > because I am one of those insane parallel-programming guys.
> > > >
> > > > Furthermore, there are other ways to get into trouble. If the code
> > > > continued as follows:
> > > >
> > > > LOAD inode = next.dentry->inode
> > > > if (inode != NULL)
> > > > LOAD inode->f_op
> > > > do_something_using_lots_of_registers();
> > > > LOAD inode->some_other_field
> > > >
> > > > and if the code expected ->f_op and ->some_other_field to be from the
> > > > same inode structure, severe disappointment could ensue. This is because
> > > > the compiler is within its rights to reload from next.dentry->inode,
> > > > especially given register pressure. In fact, the compiler would be within
> > > > its rights to reload from next.dentry->inode in the "LOAD inode->f_op"
> > > > statement. And it might well get NULL from such a reload.
> > >
> > > Except the VFS doesn't allow that. dentry->inode can go from NULL to
> > > non-NULL anytime but will only go from non-NULL to NULL when there are
> > > no possible external references to the dentry.
> > >
> > > The compiler and the CPU cannot move the "LOAD inode->some_field"
> > > before the "LOAD dentry->inode", because of the conditional, right?
> >
> > Other than Alpha, the CPU cannot. The standard -does- permit the
> > compiler to guess the value of the pointer, thus effectively moving the
> > load prior to the conditional. At present, as far as I know, gcc does
> > not actually do this.
>
> I believe the standard only allows this if the compiler can *prove* (not
> guess) the result of the if test. Even on Parisc where this can be
> expressed as a nullified instruction pair, the architectural rules
> forbid speculation of the latter before the former (the only time they
> can be executed out of order is when the CPU can prove the values
> internally, i.e. they're both cached)
Agreed, but unless you tell it otherwise, the compiler is allowed to
assume that the code it is generating is the only thread running in
the address space in question. That is most definitely not the case
here, so we might need to let the compiler in on this information.
> > Again, please put at least an ACCESS_ONCE() in. Trivial to do now,
> > possibly saving much pain and headache later on.
>
> OK, lost you here. ACCESS_ONCE() is only needed in certain situations
> (like list traversal) because some compilers can reload cached values
> across an explicit barrier (which isn't here).
ACCESS_ONCE() also tells the compiler not to try to guess.
Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists