Date:	Wed, 22 Jul 2015 07:41:48 +0200
From:	Mike Galbraith <umgwanakikbuti@...il.com>
To:	Jörn Engel <joern@...estorage.com>
Cc:	Spencer Baugh <sbaugh@...ern.com>, Don Zickus <dzickus@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ulrich Obergfell <uobergfe@...hat.com>,
	Ingo Molnar <mingo@...nel.org>,
	Andrew Jones <drjones@...hat.com>,
	chai wen <chaiw.fnst@...fujitsu.com>,
	Chris Metcalf <cmetcalf@...hip.com>,
	Stephane Eranian <eranian@...gle.com>,
	open list <linux-kernel@...r.kernel.org>,
	Spencer Baugh <Spencer.baugh@...estorage.com>,
	Joern Engel <joern@...fs.org>
Subject: Re: [PATCH] soft lockup: kill realtime threads before panic

On Tue, 2015-07-21 at 22:18 -0700, Jörn Engel wrote:
> On Wed, Jul 22, 2015 at 06:36:30AM +0200, Mike Galbraith wrote:
> > On Tue, 2015-07-21 at 15:07 -0700, Spencer Baugh wrote:
> > 
> > > We have observed cases where the soft lockup detector triggered, but no
> > > kernel bug existed.  Instead we had a buggy realtime thread that
> > > monopolized a cpu.  So let's kill the responsible party and not panic
> > > the entire system.
> > 
> > If you don't tell the kernel to panic, it won't, and if you don't remove
> > its leash (the throttle), your not so tame rt beast won't maul you.
> 
> Not sure if this patch is something for mainline, but those two
> alternatives have problems of their own.  Not panicking on lockups can
> leave a system disabled until some human comes around.  In many cases
> that human will do no better than power-cycle.  A panic reduces the
> downtime.

If a realtime task goes bonkers, the realtime game is over, you're down.
 
> And the realtime throttling gives non-realtime threads some minimum
> runtime, but does nothing to help low-priority realtime threads.  It
> also introduces latencies, often when workloads are high and you would
> like any available cpu to get through that rough spot.

You can use group scheduling as a debug crutch until the little beasts
are housebroken.
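
For reference, the throttle discussed above is set by two sysctls,
kernel.sched_rt_runtime_us and kernel.sched_rt_period_us, and the panic
behaviour by kernel.softlockup_panic.  The sketch below only reads those
values and works out what share of each period realtime tasks may use; the
helper names (rt_share, read_int) are made up for illustration, and the
fallback values assume the usual mainline defaults, which a given system
may override.

```python
from pathlib import Path

PROC = Path("/proc/sys/kernel")


def rt_share(runtime_us: int, period_us: int) -> float:
    """Fraction of each period that realtime tasks are allowed to run."""
    if period_us <= 0:
        raise ValueError("period must be positive")
    if runtime_us < 0:
        # A runtime of -1 disables the throttle: rt tasks may use it all.
        return 1.0
    return min(runtime_us / period_us, 1.0)


def read_int(name: str, default: int) -> int:
    """Read an integer sysctl from /proc, falling back to a default."""
    try:
        return int((PROC / name).read_text())
    except (OSError, ValueError):
        return default


if __name__ == "__main__":
    # Fallbacks are assumed defaults, not values taken from this thread.
    runtime = read_int("sched_rt_runtime_us", 950000)
    period = read_int("sched_rt_period_us", 1000000)
    panic = read_int("softlockup_panic", 0)

    share = rt_share(runtime, period)
    print(f"rt tasks may use {share:.0%} of each {period} us period")
    print(f"left for non-rt tasks: {1 - share:.0%}")
    print(f"panic on soft lockup: {'yes' if panic else 'no'}")
```

With the throttle left on, the non-rt remainder is what keeps a shell
usable while a runaway rt task spins; as noted above, it does nothing for
lower-priority rt tasks.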

> I don't think we have a good answer to this problem in the mainline
> kernel yet.

IMHO, there's no point in trying to make rt warm/fuzzy/cuddly.  Just
don't stuff a Hells Angel into a super-suit, that gets real ugly ;-)

	-Mike
