Date:   Thu, 5 Sep 2019 13:55:56 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Oleg Nesterov <oleg@...hat.com>,
        Russell King - ARM Linux admin <linux@...linux.org.uk>,
        Chris Metcalf <cmetcalf@...hip.com>,
        Christoph Lameter <cl@...ux.com>,
        Kirill Tkhai <tkhai@...dex.ru>, Mike Galbraith <efault@....de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Davidlohr Bueso <dave@...olabs.net>, mpe@...erman.id.au
Subject: Re: [PATCH 2/3] task: RCU protect tasks on the runqueue

On Thu, Sep 05, 2019 at 03:02:49PM -0500, Eric W. Biederman wrote:
> "Paul E. McKenney" <paulmck@...nel.org> writes:
> 
> > On Tue, Sep 03, 2019 at 10:06:03PM +0200, Peter Zijlstra wrote:
> >> On Tue, Sep 03, 2019 at 12:18:47PM -0700, Linus Torvalds wrote:
> >> > Now, if you can point to some particular field where that ordering
> >> > makes sense for the particular case of "make it active on the
> >> > runqueue" vs "look up the task from the runqueue using RCU", then I do
> >> > think that the whole release->acquire consistency makes sense.
> >> > 
> >> > But it's not clear that such a field exists, particularly when this is
> >> > in no way the *common* way to even get a task pointer, and other paths
> >> > do *not* use the runqueue as the serialization point.
> >> 
> >> Even if we could find a case (and I'm not seeing one in a hurry), I
> >> would try really hard to avoid adding extra barriers here and instead
> >> make the consumer a little more complicated if at all possible.
> >> 
> >> The Power folks got rid of a SYNC (yes, more expensive than LWSYNC) from
> >> their __switch_to() implementation and that had a measurable impact.
> >> 
> >> 9145effd626d ("powerpc/64: Drop explicit hwsync in context switch")
> >
> > The patch [1] looks good to me.  And yes, if the structure pointed to by
> > the second argument of rcu_assign_pointer() is already visible to readers,
> > it is OK to instead use RCU_INIT_POINTER().  Yes, this loses ordering.
> > But weren't these simple assignments before RCU got involved?
> >
> > As a very rough rule of thumb, LWSYNC is about twice as fast as SYNC.
> > (Depends on workload, exact details of the hardware, timing, phase of
> > the moon, you name it.)  So removing the LWSYNC is likely to provide
> > measurable benefit, but I must defer to the powerpc maintainers.
> > To that end, I added Michael on CC.
> >
> > [1] https://lore.kernel.org/lkml/878sr6t21a.fsf_-_@x220.int.ebiederm.org/
> 
> Paul, what is the purpose of the barrier in rcu_assign_pointer?
> 
> My intuition says it is the assignment half of rcu_dereference, and that
> anything that rcu_dereference does not need is too strong.
> 
> My basic understanding is that only alpha ever has the memory ordering
> issue that rcu_dereference deals with.  So I am a bit surprised that
> this is anything other than a noop for anything except alpha.

Yes, only Alpha needs an actual memory-barrier instruction in
rcu_dereference().  And it is the only one that gets one.

> In my patch I used rcu_assign_pointer because that is the canonically
> correct way to do things.  Peter makes a good case that adding an extra
> barrier in __schedule could be detrimental to system performance.
> At the same time if there is a correctness issue on alpha that we have
> been overlooking because of low testing volume on alpha I don't want to
> just let this slide and have very subtle bugs.

But rcu_assign_pointer() needs a memory-barrier instruction on the
weakly ordered architectures: ARM, powerpc, Itanium, and so on.

> The practical concern is that people have been really wanting to do
> lockless and rcu operations on tasks in the runqueue for a while and
> there are several very clever pieces of code doing that now.  By
> changing the location of the rcu put I am trying to make these uses
> ordinary rcu uses.
> 
> The uses in question are the pieces of code I update in:
> https://lore.kernel.org/lkml/8736het20c.fsf_-_@x220.int.ebiederm.org/

The rcu_assign_pointer() at the end of rcuwait_wait_event(), right?

> In short.  Why is rcu_assign_pointer() more than WRITE_ONCE() on
> anything but alpha?  Do other architectures need more?  Is the trade off
> worth it if we avoid using rcu_assign_pointer on performance critical
> paths?

Note the difference between the read-side rcu_dereference(), which
does not require any memory-barrier instructions except on Alpha,
and the update-side rcu_assign_pointer() which does require a
memory-barrier instruction on quite a few architectures.  Even on
the strongly ordered architectures (x86, s390, ...) a compiler
barrier() is required for rcu_assign_pointer().
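
As a rough user-space illustration of that asymmetry (not kernel code),
the two sides can be sketched with C11 atomics.  The names publish(),
subscribe(), gp, and struct foo are hypothetical, and mapping
rcu_assign_pointer() to a release store and rcu_dereference() to a
consume load is an approximation of the kernel primitives, not their
actual implementation:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
	int a;
};

/* Hypothetical stand-in for an RCU-protected pointer. */
static _Atomic(struct foo *) gp;

/*
 * Update side, roughly analogous to rcu_assign_pointer(): the release
 * store orders the initialization of *p before the store that makes
 * the pointer visible.  On weakly ordered CPUs this typically costs a
 * barrier instruction; on x86 it only constrains the compiler.
 */
static void publish(struct foo *p)
{
	atomic_store_explicit(&gp, p, memory_order_release);
}

/*
 * Read side, roughly analogous to rcu_dereference(): the reader relies
 * on the dependency from the loaded pointer to the fields it reads.
 * Current compilers usually treat consume as acquire, which is
 * stronger than what the kernel primitive needs.
 */
static struct foo *subscribe(void)
{
	return atomic_load_explicit(&gp, memory_order_consume);
}

/* Single-threaded walk-through: initialize, publish, then read back. */
static int demo_publish(void)
{
	static struct foo f;
	struct foo *p;

	f.a = 42;		/* initialize before publication */
	publish(&f);		/* ordered store of the pointer */
	p = subscribe();	/* dependency-ordered load */
	assert(p != NULL);
	return p->a;
}
```

Calling demo_publish() from a test harness exercises both sides in one
thread; it shows the shape of the calls only and does not demonstrate
the ordering itself, which would need concurrent readers.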

Note, however, the special cases in which RCU_INIT_POINTER() may be
used in place of rcu_assign_pointer().  These are called out in
RCU_INIT_POINTER()'s header comment:

 * Initialize an RCU-protected pointer in special cases where readers
 * do not need ordering constraints on the CPU or the compiler.  These
 * special cases are:
 *
 * 1.	This use of RCU_INIT_POINTER() is NULLing out the pointer *or*
 * 2.	The caller has taken whatever steps are required to prevent
 *	RCU readers from concurrently accessing this pointer *or*
 * 3.	The referenced data structure has already been exposed to
 *	readers either at compile time or via rcu_assign_pointer() *and*
 *
 *	a.	You have not made *any* reader-visible changes to
 *		this structure since then *or*
 *	b.	It is OK for readers accessing this structure from its
 *		new location to see the old state of the structure.  (For
 *		example, the changes were to statistical counters or to
 *		other state where exact synchronization is not required.)
 *
 * Failure to follow these rules governing use of RCU_INIT_POINTER() will
 * result in impossible-to-diagnose memory corruption.  As in the structures
 * will look OK in crash dumps, but any concurrent RCU readers might
 * see pre-initialized values of the referenced data structure.  So
 * please be very careful how you use RCU_INIT_POINTER()!!!

If current is already visible (which it should be unless
rcuwait_wait_event() is invoked at task-creation time), then
RCU_INIT_POINTER() could be used in rcuwait_wait_event().
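
A minimal user-space sketch of cases 1 and 3a from that header comment,
again using C11 atomics as an approximation.  The helpers
assign_ordered() and init_unordered(), the two slots, and struct task
are hypothetical; a relaxed store stands in for RCU_INIT_POINTER() and
a release store for rcu_assign_pointer():

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct task {
	int pid;
};

/* Two hypothetical RCU-protected slots that may point at the same task. */
static _Atomic(struct task *) slot_a;
static _Atomic(struct task *) slot_b;

/* Approximation of rcu_assign_pointer(): ordered publication. */
static void assign_ordered(_Atomic(struct task *) *pp, struct task *p)
{
	atomic_store_explicit(pp, p, memory_order_release);
}

/* Approximation of RCU_INIT_POINTER(): plain store, no ordering. */
static void init_unordered(_Atomic(struct task *) *pp, struct task *p)
{
	atomic_store_explicit(pp, p, memory_order_relaxed);
}

static int demo_reuse(void)
{
	static struct task t;
	struct task *seen;
	int pid;

	t.pid = 1234;
	assign_ordered(&slot_a, &t);	/* first exposure: ordering needed */

	/*
	 * Case 3a: t is already exposed and unchanged since then, so a
	 * second pointer to it can be stored without ordering.
	 */
	init_unordered(&slot_b, &t);
	seen = atomic_load_explicit(&slot_b, memory_order_consume);
	pid = seen->pid;

	/* Case 1: NULLing out the pointer needs no ordering either. */
	init_unordered(&slot_b, NULL);
	assert(atomic_load_explicit(&slot_b, memory_order_relaxed) == NULL);
	return pid;
}
```

The sketch runs in a single thread, so it only shows which store is
used where; whether a given kernel call site really falls under one of
the listed cases still has to be argued from the surrounding code.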

> Eric
> 
> p.s. I am being slow at working through all of this as I am dealing
>      with my young baby son, and busy packing for the conference.

Congratulations on the baby son!!!  I remember those times well, but
they were more than three decades ago for my oldest.  ;-)

>      I might not be able to get back to this discussion until after
>      I have landed in Lisbon on Saturday night.

Looking forward to it!

							Thanx, Paul
