Message-ID: <20160225172019.GR3522@linux.vnet.ibm.com>
Date:	Thu, 25 Feb 2016 09:20:19 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, mingo@...nel.org,
	jiangshanlai@...il.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
	josh@...htriplett.org, tglx@...utronix.de, rostedt@...dmis.org,
	dhowells@...hat.com, edumazet@...gle.com, dvhart@...ux.intel.com,
	fweisbec@...il.com, oleg@...hat.com, bobby.prani@...il.com
Subject: Re: [PATCH tip/core/rcu 03/13] rcu: Stop treating in-kernel
 CPU-bound workloads as errors

On Thu, Feb 25, 2016 at 10:43:17AM +0100, Peter Zijlstra wrote:
> On Tue, Feb 23, 2016 at 09:12:40PM -0800, Paul E. McKenney wrote:
> > Commit 4a81e8328d379 ("Reduce overhead of cond_resched() checks for RCU")
> > handles the error case where a nohz_full CPU loops indefinitely in the kernel
> > with the scheduling-clock interrupt disabled.  However, this handling
> > includes IPIing the CPU running the offending loop, which is not what
> > we want for real-time workloads.  And there are starting to be real-time
> > CPU-bound in-kernel workloads, and these must be handled without IPIing
> > the CPU, at least not in the common case.  Therefore, this situation can
> > no longer be dismissed as an error case.
> 
> Do explain. Doing "for (;;) ;" in a kernel RT thread is just as bad for
> general system health as is doing the same in userspace.

The use case is instead something like this:

	for (;;) {
		do_something();
		cond_resched_rcu_qs();
	}

If you instead do something like this:

	for (;;)
		do_something();

where do_something() doesn't invoke cond_resched_rcu_qs() often enough,
then your kernel is broken and the warranty says that you get to keep
the pieces.
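
As a rough userspace illustration of the difference between the two loops
above, here is a toy model.  It is not kernel code: every toy_* name is made
up for this sketch, toy_report_qs() only stands in for what
cond_resched_rcu_qs() does from RCU's point of view, and the stall check is
a heavy simplification of what RCU actually does.

```c
#include <stdbool.h>

/* Userspace toy model only; none of the toy_* names are kernel APIs. */
struct toy_cpu {
	unsigned long qs_count;		/* quiescent states reported so far */
	unsigned long qs_snapshot;	/* qs_count as of the last stall check */
};

/* Hypothetical stand-in for cond_resched_rcu_qs(): note a quiescent state. */
static void toy_report_qs(struct toy_cpu *cpu)
{
	cpu->qs_count++;
}

/* Hypothetical stand-in for do_something(): the real work would go here. */
static void toy_do_something(void)
{
}

/*
 * Simplified stall check: the CPU counts as stalled if it reported no
 * quiescent state since the previous check.
 */
static bool toy_cpu_stalled(struct toy_cpu *cpu)
{
	bool stalled = (cpu->qs_count == cpu->qs_snapshot);

	cpu->qs_snapshot = cpu->qs_count;
	return stalled;
}

/* First loop: each pass reports a quiescent state (bounded so it ends). */
static void toy_loop_with_qs(struct toy_cpu *cpu, int iters)
{
	for (int i = 0; i < iters; i++) {
		toy_do_something();
		toy_report_qs(cpu);
	}
}

/* Second loop: no pass ever reports a quiescent state. */
static void toy_loop_without_qs(struct toy_cpu *cpu, int iters)
{
	(void)cpu;	/* deliberately never touched */
	for (int i = 0; i < iters; i++)
		toy_do_something();
}
```

In this model, running toy_loop_with_qs() on a zero-initialized struct
toy_cpu and then calling toy_cpu_stalled() returns false, while doing the
same with toy_loop_without_qs() returns true.  The second case is the one
where something else would have to step in.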

> Also, who runs his RT workload in-kernel ?

That would be me, actually.

I use something very much like this in rcutorture and in rcuperf (the
latter currently exists only in -rcu, although 0day has been helpfully
finding various problems with it).  In rcutorture, the problem never
arises given default kernel-boot-parameter settings.  However, you
could easily set various timing parameters to exceed the RCU CPU stall
warning timeout.

In rcuperf, this sort of thing happens by default under heavy load.

So why bother if the use case is this obscure?

Because I have been getting beaten up repeatedly over the past few years
about RCU sending IPIs, so I figured that this time I should at least
-try- to get ahead of the game!  ;-)

							Thanx, Paul
