Date:   Thu, 13 Apr 2017 11:54:20 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     linux-kernel@...r.kernel.org, mingo@...nel.org,
        jiangshanlai@...il.com, dipankar@...ibm.com,
        akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
        josh@...htriplett.org, tglx@...utronix.de, rostedt@...dmis.org,
        dhowells@...hat.com, edumazet@...gle.com, fweisbec@...il.com,
        oleg@...hat.com, bobby.prani@...il.com
Subject: Re: [PATCH tip/core/rcu 40/40] srcu: Parallelize callback handling

On Wed, Apr 12, 2017 at 10:40:25AM -0700, Paul E. McKenney wrote:
> Peter Zijlstra proposed using SRCU to reduce mmap_sem contention [1],
> however, there are workloads that could result in a high volume of
> concurrent invocations of call_srcu(), which with current SRCU would
> result in excessive lock contention on the srcu_struct structure's
> ->queue_lock, which protects SRCU's callback lists.  This commit therefore
> moves SRCU to per-CPU callback lists, thus greatly reducing contention.
> 
> Because a given SRCU instance no longer has a single centralized callback
> list, starting grace periods and invoking callbacks each require a bit
> more work.  These are handled using an srcu_node tree that is in some ways
> similar to the rcu_node trees used by RCU-bh, RCU-preempt, and RCU-sched
> (for example, the srcu_node tree shape is controlled by exactly the
> same Kconfig options and boot parameters that control the shape of the
> rcu_node tree).
> 
> In addition, the old per-CPU srcu_array structure is now named srcu_data
> and contains an rcu_segcblist structure named ->srcu_cblist for its
> callbacks (and a spinlock to protect this).  The srcu_struct gets
> an srcu_gp_seq that is used to associate callback segments with the
> corresponding completion-time grace-period number.  These completion-time
> grace-period numbers are propagated up the srcu_node tree so that the
> grace-period workqueue handler can determine whether additional grace
> periods are needed on the one hand and where to look for callbacks that
> are ready to be invoked on the other.
> 
> The srcu_barrier() function must now wait on all instances of the
> per-CPU ->srcu_cblist.  Because each ->srcu_cblist is protected
> by ->lock, srcu_barrier() can remotely add the needed callbacks.
> In theory, it could also remotely start grace periods, but this gets
> complex and racy.  And interestingly enough, it is never necessary to
> start a grace period in this case because srcu_barrier() only enqueues
> a callback when a callback is already present.  And a grace period has
> to have already been started for this pre-existing callback.  And it is
> only the callback that srcu_barrier() needs to wait on, not any particular
> grace period.  Therefore, a new rcu_segcblist_entrain() function enqueues
> the srcu_barrier() function's callback into the same segment occupied by
> the pre-existing callback.  The special case where all the pre-existing
> callbacks are on a different list being invoked is handled by enqueuing
> srcu_barrier()'s callback into the RCU_DONE_TAIL segment, relying on
> the done-callbacks check that takes place after all callbacks are invoked.
> 
> Note that the readers use the same algorithm as before.  Note that there
> is a separate srcu_idx that tells the readers what counter to increment.
> This unfortunately cannot be combined with srcu_gp_seq because they
> need to be incremented at different times.

So one thing I think I've asked before: would it not be possible to
abstract PREEMPT_RCU and use the exact same code for PREEMPT_RCU and
SRCU?
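
For readers following the quoted description of rcu_segcblist_entrain(),
the idea can be sketched in userspace C. This is a simplified model under
stated assumptions, not the kernel's code: the type and helper names
(seglist, seglist_entrain, seglist_accelerate_all, seglist_detach_all,
seglist_segment_of) are hypothetical, there is no locking, and
grace-period numbers are left out. Only the segment names mirror the
RCU_*_TAIL names used in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Segment indices, named after the RCU_*_TAIL segments in the quoted text. */
enum { DONE_TAIL, WAIT_TAIL, NEXT_READY_TAIL, NEXT_TAIL, NSEGS };

struct cb {
    struct cb *next;
};

struct seglist {
    struct cb *head;
    struct cb **tails[NSEGS]; /* tails[i]: the ->next slot that ends segment i */
    long len;                 /* callbacks queued here or currently off-list */
};

static void seglist_init(struct seglist *sl)
{
    sl->head = NULL;
    for (int i = 0; i < NSEGS; i++)
        sl->tails[i] = &sl->head;
    sl->len = 0;
}

/* Ordinary enqueue: new callbacks always go at the end of the last segment. */
static void seglist_enqueue(struct seglist *sl, struct cb *cb)
{
    cb->next = NULL;
    *sl->tails[NEXT_TAIL] = cb;
    sl->tails[NEXT_TAIL] = &cb->next;
    sl->len++;
}

/* Simplified stand-in for grace-period assignment: every queued callback
 * that is not already done is treated as waiting on the current grace period. */
static void seglist_accelerate_all(struct seglist *sl)
{
    for (int i = WAIT_TAIL; i < NEXT_TAIL; i++)
        sl->tails[i] = sl->tails[NEXT_TAIL];
}

/* Model callbacks being pulled off the list for invocation: the list
 * becomes empty, but len still counts the callbacks being invoked. */
static struct cb *seglist_detach_all(struct seglist *sl)
{
    struct cb *head = sl->head;

    sl->head = NULL;
    for (int i = 0; i < NSEGS; i++)
        sl->tails[i] = &sl->head;
    return head;
}

/* Entrain: if any callback is pending, append cb to the last non-empty
 * segment so it rides along with the pre-existing callback. If every
 * segment is empty but len is nonzero, cb lands in the DONE segment. */
static bool seglist_entrain(struct seglist *sl, struct cb *cb)
{
    int i;

    if (sl->len == 0)
        return false; /* nothing to wait on, so nothing to entrain */
    sl->len++;
    cb->next = NULL;
    for (i = NEXT_TAIL; i > DONE_TAIL; i--)
        if (sl->tails[i] != sl->tails[i - 1])
            break; /* segment i is the last non-empty one */
    *sl->tails[i] = cb;
    for (; i <= NEXT_TAIL; i++)
        sl->tails[i] = &cb->next;
    return true;
}

/* Return the segment index holding target, or -1 if it is not on the list. */
static int seglist_segment_of(struct seglist *sl, const struct cb *target)
{
    int seg = DONE_TAIL;

    for (struct cb **pp = &sl->head; *pp; pp = &(*pp)->next) {
        while (seg < NSEGS - 1 && sl->tails[seg] == pp)
            seg++; /* segment seg ended before this element */
        if (*pp == target)
            return seg;
    }
    return -1;
}
```

In this model, entraining onto an empty list with len == 0 is refused,
which corresponds to the point in the commit message that srcu_barrier()
only enqueues a callback when one is already present, and so never needs
to start a grace period itself.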
