Date:   Wed, 26 Apr 2017 08:44:02 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Mike Galbraith <efault@....de>
Cc:     LKML <linux-kernel@...r.kernel.org>
Subject: Re: TREE_SRCU slows hotplug by factor ~16

On Wed, Apr 26, 2017 at 05:26:20PM +0200, Mike Galbraith wrote:
> On Wed, 2017-04-26 at 07:31 -0700, Paul E. McKenney wrote:
> 
> > And a sneak preview, semi-tested.  If you get a chance to run this, please
> > let me know how it goes.
> 
> That took 'time stress-cpu-hotplug.sh' down to 48s, close to classic.

Woo-hoo!!!  ;-)

And thank you for your testing efforts!

Should I be comparing this with the 55s number from your initial email,
or to the 39s number?

Either way, given the unusual nature of Steven's hotplug stress test,
I believe that this is good enough for this merge window.  But if we
are talking 48s for Tree SRCU vs. 39s with Classic SRCU, it would be
good to at least understand where the remaining slowdown comes from.
Here are a couple of possible causes:

o	My holdoff is too long.  I set it to 50 microseconds based
	on your trace, which shows a minimum grace-period separation
	of 118 microseconds.  But perhaps the trace was too short to
	show the full variation.  One way to check this is to run with
	srcutree.exp_holdoff=25000 or some such.  (Please note that
	srcutree.exp_holdoff is in nanoseconds, -not- microseconds.)

o	My expedited throttling is too aggressive.  This is controlled
	by the following lines of code in srcu_gp_end() in the file
	kernel/rcu/srcutree.c:

		/* Throttle expedited grace periods: Should be rare! */
		srcu_reschedule(sp, rcu_seq_ctr(gpseq) & 0x3ff
				    ? 0 : SRCU_INTERVAL);

	The "0x3ff" says that one in 1024 grace periods should be
	forced to be at least partially non-expedited, regardless
	of anything else.  If making this be (say) "0xfff" gets
	you three-quarters of the way to the 39s, that indicates
	that this is the controlling factor.

o	Of course, another question is how much variation there is
	in the timing of that stress test.
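
To make the arithmetic in the first two items concrete, here is a
minimal user-space sketch, not kernel code.  It mirrors only the shape
of the conditional passed to srcu_reschedule() and the unit conversion
for srcutree.exp_holdoff.  All of the demo_* names and the
DEMO_SRCU_INTERVAL value are hypothetical stand-ins invented for this
illustration.

```c
/* Hypothetical stand-in for SRCU_INTERVAL; the real value is kernel-defined. */
#define DEMO_SRCU_INTERVAL 1UL

/*
 * Same shape as the expression in srcu_gp_end(): zero delay unless
 * all of the counter bits selected by mask are zero.
 */
static unsigned long demo_throttle_delay(unsigned long ctr, unsigned long mask)
{
	return (ctr & mask) ? 0 : DEMO_SRCU_INTERVAL;
}

/* Count how many counter values in [0, n) would get the nonzero delay. */
static unsigned long demo_count_throttled(unsigned long n, unsigned long mask)
{
	unsigned long i;
	unsigned long hits = 0;

	for (i = 0; i < n; i++)
		if (demo_throttle_delay(i, mask))
			hits++;
	return hits;
}

/* srcutree.exp_holdoff takes nanoseconds, so convert from microseconds. */
static unsigned long demo_us_to_ns(unsigned long us)
{
	return us * 1000UL;
}
```

Over 4096 consecutive counter values, demo_count_throttled() finds
four throttled grace periods with a mask of 0x3ff (one in 1024) and
one with 0xfff (one in 4096).  Likewise, demo_us_to_ns(25) gives the
25000 used in the srcutree.exp_holdoff example.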

If further reduction is needed, and none of these help, could you
please send me a trace of the full run of the same form as the last
one you sent me, covering calls to and returns from call_srcu(),
synchronize_srcu(), and synchronize_srcu_expedited()?

							Thanx, paul
