lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150911230554.GA26655@linux.vnet.ibm.com>
Date:	Fri, 11 Sep 2015 16:05:54 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	NeilBrown <neilb@...e.de>,
	Patrick Marlier <patrick.marlier@...il.com>,
	linux-kernel@...r.kernel.org, mingo@...nel.org,
	laijs@...fujitsu.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
	josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
	dhowells@...hat.com, edumazet@...gle.com, dvhart@...ux.intel.com,
	fweisbec@...il.com, oleg@...hat.com, bobby.prani@...il.com,
	wangyun@...ux.vnet.ibm.com
Subject: Re: [PATCH tip/core/rcu 3/4] md/bitmap: Fix list_entry_rcu usage

On Tue, May 19, 2015 at 03:07:25PM -0700, Paul E. McKenney wrote:
> On Mon, May 18, 2015 at 09:43:21AM -0400, Steven Rostedt wrote:
> > On Mon, 18 May 2015 12:06:47 +1000
> > NeilBrown <neilb@...e.de> wrote:
> > 
> > 
> > > > struct mddev {
> > > > ...
> > > > 	struct list_head                disks;
> > > > ...}
> > > > 
> > > > struct list_head {
> > > >          struct list_head *next, *prev;
> > > > };
> > > > 
> > > > The tricky thing is that "list_entry_rcu" before and after the patch is 
> > > > reading the same thing.
> > > 
> > > No it isn't.
> > > Before the patch it is passed the address of the 'next' field.  After the
> > > patch it is passed the contents of the 'next' field.
> > 
> > Right.
> > 
> > > 
> > > 
> > > > 
> > > > However in your case, the change I proposed is probably wrong I trust 
> > > > you on this side. :) What's your proposal to fix it with the rculist patch?
> > > 
> > > What needs fixing?  I don't see anything broken.
> > > 
> > > Maybe there is something in this "rculist patch" that I'm missing.  Can you
> > > point me at it?
> > > 
> > 
> > Probably some debugging tool like sparse notices that the assignment
> > isn't a true list entry and complains about it. In other words, I think
> > the real fix is to fix the debugging tool to ignore this, because the
> > code is correct, and this is a false positive failure, and is causing
> > more harm than good, because people are sending out broken patches due
> > to it.
> 
> OK, finally did the history trawling that I should have done to begin with.
> 
> Back in 2010, Arnd added the __rcu pointer checking in sparse.
> But the RCU list primitives were used on non-RCU-protected lists, so
> some casting pain was required to avoid sparse complaints.  (Keep in
> mind that the list_head structure does not mark ->next with __rcu.)
> Arnd's workaround was to copy the pointer to the local stack, casting
> it to an __rcu pointer, then use rcu_dereference_raw() to do the needed
> traversal of an RCU-protected pointer.
> 
> This of course resulted in an extraneous load from the stack, which
> Patrick noticed in his performance work, and which motivated him to send
> the patches.
> 
> Perhaps what I should do is create an rcu_dereference_nocheck() for use
> in list traversals, that omits the sparse checking.  That should get rid
> of both the sparse warnings and the strange casts.
> 
> The code in md probably needs to change in any case, as otherwise we are
> invoking rcu_dereference_whatever() on a full struct list_head rather
> than on a single pointer.  Or am I missing something here?

Finally getting back to this one...

I switched to lockless_dereference() instead of rcu_dereference_raw(),
and am running it through the testing gamut.  Patrick, are you OK with
this change?

							Thanx, Paul

------------------------------------------------------------------------

    rculist: Make list_entry_rcu() use lockless_dereference()
    
    The current list_entry_rcu() implementation copies the pointer to a stack
    variable, then invokes rcu_dereference_raw() on it.  This results in an
    additional store-load pair.  Now, most compilers will emit normal store
    and load instructions, which might seem to be of negligible overhead,
    but this results in a load-hit-store situation that can cause surprisingly
    long pipeline stalls, even on modern microprocessors.  The problem is
    that it takes time for the store to get the store buffer updated, which
    can delay the subsequent load, which immediately follows.
    
    This commit therefore switches to the lockless_dereference() primitive,
    which does not expect the __rcu annotations (that are anyway not present
    in the list_head structure) and which, like rcu_dereference_raw(),
    does not check for an enclosing RCU read-side critical section.
    Most importantly, it does not copy the pointer, thus avoiding the
    load-hit-store overhead.
    
    Signed-off-by: Patrick Marlier <patrick.marlier@...il.com>
    [ paulmck: Switched to lockless_dereference() to suppress sparse warnings. ]
    Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 17c6b1f84a77..5ed540986019 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -247,10 +247,7 @@ static inline void list_splice_init_rcu(struct list_head *list,
  * primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
  */
 #define list_entry_rcu(ptr, type, member) \
-({ \
-	typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
-	container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \
-})
+	container_of(lockless_dereference(ptr), type, member)
 
 /**
  * Where are list_empty_rcu() and list_first_entry_rcu()?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ