Date:	Thu, 06 Sep 2012 17:41:01 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	paulmck@...ux.vnet.ibm.com
Cc:	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
	dipankar@...ibm.com, akpm@...ux-foundation.org,
	mathieu.desnoyers@...ymtl.ca, josh@...htriplett.org,
	niv@...ibm.com, tglx@...utronix.de, Valdis.Kletnieks@...edu,
	dhowells@...hat.com, eric.dumazet@...il.com, darren@...art.com,
	fweisbec@...il.com, sbw@....edu, patches@...aro.org,
	"Paul E. McKenney" <paul.mckenney@...aro.org>
Subject: Re: [PATCH tip/core/rcu 11/15] rcu: Avoid spurious RCU CPU stall
 warnings

On Thu, 2012-09-06 at 14:03 -0700, Paul E. McKenney wrote:

> Here are a few other ways that stalls can happen:
> 
> o	A CPU looping in an RCU read-side critical section.

For a minute? That's a bug.

> 	
> o	A CPU looping with interrupts disabled.  This condition can
> 	result in RCU-sched and RCU-bh stalls.

Also a bug.

> 
> o	A CPU looping with preemption disabled.  This condition can
> 	result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
> 	stalls.

Bug as well.

> 
> o	A CPU looping with bottom halves disabled.  This condition can
> 	result in RCU-sched and RCU-bh stalls.

Bug too.

> 
> o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
> 	without invoking schedule().

Another bug.

> 
> o	A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
> 	happen to preempt a low-priority task in the middle of an RCU
> 	read-side critical section.   This is especially damaging if
> 	that low-priority task is not permitted to run on any other CPU,
> 	in which case the next RCU grace period can never complete, which
> 	will eventually cause the system to run out of memory and hang.
> 	While the system is in the process of running itself out of
> 	memory, you might see stall-warning messages.

Buggy system.

> 
> o	A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that
> 	is running at a higher priority than the RCU softirq threads.
> 	This will prevent RCU callbacks from ever being invoked,
> 	and in a CONFIG_TREE_PREEMPT_RCU kernel will further prevent
> 	RCU grace periods from ever completing.  Either way, the
> 	system will eventually run out of memory and hang.  In the
> 	CONFIG_TREE_PREEMPT_RCU case, you might see stall-warning
> 	messages.

Not really a bug, but the developers need a spanking.

> 
> o	A hardware or software issue shuts off the scheduler-clock
> 	interrupt on a CPU that is not in dyntick-idle mode.  This
> 	problem really has happened, and seems to be most likely to
> 	result in RCU CPU stall warnings for CONFIG_NO_HZ=n kernels.

Driving the bug.

> 
> o	A bug in the RCU implementation.

Bug in the name.

> 
> o	A hardware failure.  This is quite unlikely, but has occurred
> 	at least once in real life.  A CPU failed in a running system,
> 	becoming unresponsive, but not causing an immediate crash.
> 	This resulted in a series of RCU CPU stall warnings, eventually
> 	leading to the realization that the CPU had failed.

Hardware bug.

So, where's the "spurious RCU CPU stall warnings"?

All these cases deserve a warning.

-- Steve

