linux-kernel - Re: [PATCH 4/5] don't panic if /sbin/init exits or killed

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20080316230346.GA379@tv-sign.ru>
Date:	Mon, 17 Mar 2008 02:03:46 +0300
From:	Oleg Nesterov <oleg@...sign.ru>
To:	Roland McGrath <roland@...hat.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Davide Libenzi <davidel@...ilserver.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Ingo Molnar <mingo@...e.hu>,
	Laurent Riffard <laurent.riffard@...e.fr>,
	Pavel Emelyanov <xemul@...nvz.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/5] don't panic if /sbin/init exits or killed

On 03/16, Roland McGrath wrote:
>

(re-ordered)

> Have you tested how recoverable it really is?  I wonder what happens
> with init having exited when things get reparented to it.  Don't the
> zombies just pile up?

Yes sure, we leak the re-parented zombies, and nobody can take care of
/etc/inittab. As expected.

But otherwise the system runs fine.

> BUG() does not seem right to me.  This does not diagnose any kernel bug.
> The kernel source location and backtrace are not useful.  In fact, they
> are likely to mislead the user into reporting the bug to the wrong place
> (because it will look like a kernel bug).

But panic() isn't better? It doesn't provide any useful info.

> I gather your motivation is to get something "recoverable" rather than
> always rebooting.  This might be useful for developers like you and me.
> I suspect that conservative administrators of production systems prefer
> the current behavior.  If the boot init dies, that is reasonably likely
> to be a "catastrophic" failure of the system as a whole as far as the
> proprietor of a production system is concerned.  That is, the system may
> no longer behave as expected in ways essential for its normal operation.
> If it sticks around in that condition, appearing to be available but not
> doing everything it should, that is usually worse than a quick and
> orderly crash (which the installation's procedures and monitoring
> infrastructure are often prepared to handle).

Well, I think the generic "if we have a chance to survive, we should try
to survive" rule is good.

If the boot init dies, at least the admin has a chance to figure out what
has happened, and -o remount,ro /.

Every BUG/BUG_ON in fact means the system is not useable, but still it does
not panic(), but tries to proceed.

In short, I can't see why panic() is better. Except we have panic_timeout,
but we can take it into account if init exits.

> panic is a bit extreme for the situation, where we have no reason yet to
> think kernel data structures are inconsistent.  A sync+reboot or sync+crash
> without bust_spinlocks et al might be better.
> 
> For letting init die and calling it recoverable for hacking purposes, a
> sysctl to disable the panic/crash makes sense.  But I don't think we
> should change the default setting.

OK, I won't argue (not that I agree ;).

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/