[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160302013759.GA362@dvhart-mobl5.amr.corp.intel.com>
Date: Tue, 1 Mar 2016 17:37:59 -0800
From: Darren Hart <dvhart@...radead.org>
To: Jianyu Zhan <nasa4836@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>, dave@...olabs.net,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...nel.org>, dvhart@...ux.intel.com,
bigeasy@...utronix.de, Paul McKenney <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH] futex: replace bare barrier() with a READ_ONCE()
On Tue, Mar 01, 2016 at 08:44:48AM +0800, Jianyu Zhan wrote:
> On Tue, Mar 1, 2016 at 6:22 AM, Darren Hart <dvhart@...radead.org> wrote:
> > You've mention it causes problems a few times, but you do not specify what
> > problem it causes or how it manifests.
> >
> > Is this a theoretical bug, or have you experienced a failure case. If so, how
> > did this manifest? Do you have a reproducer we could add to the futex testsuite
> > in the kernel selftests?
>
>
> 1. For the first problem I memtioned, it is a bug that described in
> commit e91467ecd1ef.
>
> The scenario is like:
>
> lock_ptr = q->lock_ptr
> if (lock_ptr != 0) spin_lock(lock_ptr)
>
> and the compiler generates code that is equivalent to :
>
> if (q->lock_ptr != 0) spin_lock(q->lock_ptr)
>
> The problem is q->lock_ptr can change between the test of nullness of
> q->lock_ptr and the lock on q->lock_ptr
> So a barrier() is inserted into the load of q->lock_ptr and the test
> of nullness, to avoid the pointer aliasing.
>
> Apparently, a READ_ONCE() fits the goal here.
read_once will use a *volatile assignment instead of calling barrier()
directly for a word size argument.
With weak statements like "apparently" (above) and "could be" (from the original
post: This patch replaces this bare barrier() with a READ_ONCE(), a weaker form
of barrier(), which could be more informative.)" I do not see a compelling
reason to change what is notoriously fragile code with respect to barriers.
As for #2...
> 2. For the second problem I memtioned, yes, it is theoretical, and
> it is also due to q->lock_ptr can change between
> the test of nullness of q->lock_ptr and the lock on q->lock_ptr.
>
> the code is
>
> retry:
> lock_ptr = q->lock_ptr;
> if (lock_ptr != 0) {
> spin_lock(lock_ptr)
> if (unlikely(lock_ptr != q->lock_ptr)) {
> spin_unlock(lock_ptr);
> goto retry;
> }
> ...
> }
>
> which is effectively the same as :
>
> while (lock_ptr = q->lock_ptr) {
> spin_lock(lock_ptr)
> if (unlikely(lock_ptr != q->lock_ptr)) {
> spin_unlock(lock_ptr);
> goto retry;
> }
> ...
> }
>
> This might cause the compiler load the q->lock_ptr once and use it
> many times, quoted from
> memory-barriers.txt:
>
>
> (*) The compiler is within its rights to reload a variable, for example,
> in cases where high register pressure prevents the compiler from
> keeping all data of interest in registers. The compiler might
> therefore optimize the variable 'tmp' out of our previous example:
>
> while (tmp = a)
> do_something_with(tmp);
>
> This could result in the following code, which is perfectly safe in
> single-threaded code, but can be fatal in concurrent code:
>
> while (a)
> do_something_with(a);
>
> For example, the optimized version of this code could result in
> passing a zero to do_something_with() in the case where the variable
> a was modified by some other CPU between the "while" statement and
> the call to do_something_with().
>
> Again, use READ_ONCE() to prevent the compiler from doing this:
>
> while (tmp = READ_ONCE(a))
> do_something_with(tmp);
>
OK, so this is really the meat of the argument for the patch. You are looking to
add a compiler barrier instead of a CPU memory barrier. This should be what your
commit message is focused on and it should provide compelling evidence to
justify risking the change.
Compelling evidence for a compiler barrier would be a disassembly of the code
block before and after, demonstrating the compiler generating broken code and the
compiler generating correct code.
In addition to this, however, I would want to see a strong convincing argument
that the READ_ONCE volatile cast is sufficient to cover the original case which
motivated the addition of the barrier() (not "apparently" and "could be").
Thanks,
--
Darren Hart
Intel Open Source Technology Center
Powered by blists - more mailing lists