Date:	Mon, 20 Feb 2012 17:50:37 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Lai Jiangshan <laijs@...fujitsu.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...ymtl.ca,
	josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
	peterz@...radead.org, rostedt@...dmis.org, Valdis.Kletnieks@...edu,
	dhowells@...hat.com, eric.dumazet@...il.com, darren@...art.com,
	fweisbec@...il.com, patches@...aro.org
Subject: Re: [PATCH RFC tip/core/rcu] rcu: direct algorithmic SRCU
 implementation

On Tue, Feb 21, 2012 at 09:11:47AM +0800, Lai Jiangshan wrote:
> On 02/21/2012 01:44 AM, Paul E. McKenney wrote:
> 
> > 
> >> My conclusion, we can just remove the check-and-return path to reduce
> >> the complexity since we will introduce call_srcu().
> > 
> > If I actually submit the above upstream, that would be quite reasonable.
> > My thought is that patch remains RFC and the upstream version has
> > call_srcu().
> 
> Has work on call_srcu() been started or drafted?

I do have a draft design, and am currently beating it into shape.
No actual code yet, though.  The general idea at the moment is as follows:

o	The state machine must be preemptible.  I recently received
	a bug report about 200-microsecond latency spikes on a system
	with more than a thousand CPUs, so the summation of the per-CPU
	counters and subsequent recheck cannot be in a preempt-disable
	region.  I am therefore currently thinking in terms of a kthread.

o	At the moment, having a per-srcu_struct kthread seems excessive.
	I am planning on a single kthread to do the counter summation
	and checking.  Further parallelism might be useful in the future,
	but I would want to see someone run into problems before adding
	more complexity.

o	There needs to be a linked list of srcu_struct structures so
	that they can be traversed by the state-machine kthread.

o	If there are expedited SRCU callbacks anywhere, the kthread
	would scan through the list of srcu_struct structures quickly
	(perhaps pausing a few microseconds between).  If there are no
	expedited SRCU callbacks, the kthread would wait a jiffy or so
	between scans.

o	If a given srcu_struct structure has been scanned too many times
	(say, more than ten times) while waiting for the counters to go
	to zero, it loses expeditedness.  It makes no sense for the kthread
	to go CPU-bound just because some SRCU reader somewhere is blocked
	in its SRCU read-side critical section.

o	Expedited SRCU callbacks cannot be delayed by normal SRCU
	callbacks, but neither can expedited callbacks be allowed to
	starve normal callbacks.  I am thinking in terms of invoking these
	from softirq context, with a pair of multi-tailed callback queues
	per CPU, stored in the same structure as the per-CPU counters.

o	There are enough srcu_struct structures in the Linux kernel that
	it does not make sense to force softirq to dig through them all
	any time any one of them has callbacks ready to invoke.  One way
	to deal with this is to have a per-CPU set of linked lists of
	srcu_struct_array structures, so that the kthread enqueues
	a given structure when it transitions to having callbacks ready
	to invoke, and softirq dequeues it.  This can be done locklessly
	given that there is only one producer and one consumer.

o	We can no longer use the trick of pushing callbacks to another
	CPU from the CPU_DYING notifier because it is likely that CPU
	hotplug will stop using stop_cpus().  I am therefore thinking
	in terms of a set of orphanages (two for normal, two more for
	expedited -- one set of each for callbacks ready to invoke,
	the other for still-waiting callbacks).

o	There will need to be an srcu_barrier() that can be called
	before cleanup_srcu_struct().  Otherwise, someone will end up
	freeing up an srcu_struct that still has callbacks outstanding.
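
To make the scan-pacing rules above a bit more concrete, here is a rough
userspace sketch in C.  It is not kernel code and not a proposed
implementation.  Every name in it (toy_srcu, toy_scan_pass(), the pause
constants) is hypothetical, the readers field merely stands in for the
summed per-CPU counters, and the two pause values are placeholders for
"a few microseconds" and "a jiffy or so".

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_EXPEDITED_SCANS	10	/* "say, more than ten times" */
#define FAST_PAUSE_US		5	/* assumed: "a few microseconds" */
#define SLOW_PAUSE_US		10000	/* assumed: stand-in for "a jiffy or so" */

/* Hypothetical stand-in for srcu_struct: only what the policy needs. */
struct toy_srcu {
	struct toy_srcu *next;	/* linked list walked by the kthread */
	long readers;		/* stand-in for the summed per-CPU counters */
	bool expedited;		/* expedited callbacks are waiting */
	int scans;		/* scans so far while waiting for zero */
};

/*
 * One pass over the list.  Returns how long the caller should pause
 * before the next pass: short if any structure is still expedited,
 * long otherwise.
 */
unsigned int toy_scan_pass(struct toy_srcu *head)
{
	bool any_expedited = false;
	struct toy_srcu *sp;

	for (sp = head; sp; sp = sp->next) {
		if (sp->readers == 0) {
			/* Counters reached zero: callbacks would be handed off here. */
			sp->scans = 0;
			sp->expedited = false;
			continue;
		}
		/* Scanned too many times while still waiting: demote. */
		if (sp->expedited && ++sp->scans > MAX_EXPEDITED_SCANS)
			sp->expedited = false;
		if (sp->expedited)
			any_expedited = true;
	}
	return any_expedited ? FAST_PAUSE_US : SLOW_PAUSE_US;
}
```

With a single expedited structure whose reader count never drops to zero,
the first ten passes ask for the short pause and the eleventh demotes it,
so a blocked reader cannot keep the scanning thread CPU-bound.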

But what did you have in mind?

> >> This new SRCU looks very good to me, especially the SRCU_USAGE_COUNT for
> >> every lock/unlock, which forces any increment/decrement pair to change
> >> the counter.
> > 
> > Glad you like it!  ;-)
> > 
> > And thank you for your review and feedback!
> 
> Could you add my Reviewed-by when this patch is finally submitted?
> 
> 
> Reviewed-by: Lai Jiangshan <laijs@...fujitsu.com>

Will do, thank you!

							Thanx, Paul
