Message-ID: <20150722063323.GE23662@Sligo.logfs.org>
Date: Tue, 21 Jul 2015 23:33:23 -0700
From: Jörn Engel <joern@...estorage.com>
To: Mike Galbraith <umgwanakikbuti@...il.com>
Cc: Spencer Baugh <sbaugh@...ern.com>, Don Zickus <dzickus@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Ulrich Obergfell <uobergfe@...hat.com>,
Ingo Molnar <mingo@...nel.org>,
Andrew Jones <drjones@...hat.com>,
chai wen <chaiw.fnst@...fujitsu.com>,
Chris Metcalf <cmetcalf@...hip.com>,
Stephane Eranian <eranian@...gle.com>,
open list <linux-kernel@...r.kernel.org>,
Spencer Baugh <Spencer.baugh@...estorage.com>,
Joern Engel <joern@...fs.org>
Subject: Re: [PATCH] soft lockup: kill realtime threads before panic

On Wed, Jul 22, 2015 at 07:41:48AM +0200, Mike Galbraith wrote:
> On Tue, 2015-07-21 at 22:18 -0700, Jörn Engel wrote:
> >
> > Not sure if this patch is something for mainline, but those two
> > alternatives have problems of their own. Not panicking on lockups can
> > leave a system disabled until some human comes around. In many cases
> > that human will do no better than power-cycle. A panic reduces the
> > downtime.
>
> If a realtime task goes bonkers, the realtime game is over, you're down.

Agreed. But a reboot will often solve the issue. So the automatic
panic will repair the system within minutes, while no panic will leave
the system broken for days, depending on human response time. Automatic
panic is a great way to minimize downtime - or vulnerable time if you
have HA.

One could argue that killing the realtime thread is even better than
panic, as things can restart with a blank slate even faster. But the
real benefit is that we get better debug data for the failing component.
If we had a kernel bug, the backtrace would usually be sufficient to
point fingers. With a bonkers realtime thread, not so much.

Anyway, this patch has been useful to us once. If someone deems it
merge-worthy, great. If not, I won't lose any sleep either.

Jörn

--
The key to performance is elegance, not battalions of special cases.
-- Jon Bentley and Doug McIlroy