[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <alpine.DEB.1.10.0812290930550.15026@asgard.lang.hm>
Date: Mon, 29 Dec 2008 09:36:11 -0800 (PST)
From: david@...g.hm
To: Oleg Nesterov <oleg@...hat.com>
cc: Scott James Remnant <scott@...onical.com>,
Roland McGrath <roland@...hat.com>,
lkml <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC][PATCH] Notify init when processes are reparented to it
On Mon, 29 Dec 2008, Oleg Nesterov wrote:
> On 12/29, Scott James Remnant wrote:
>>
>> On Mon, 2008-12-29 at 14:23 +0100, Oleg Nesterov wrote:
>>
>>>> We want to be able to supervise daemons.
>>>
>>> What do you mean?
>>>
>>>> Later on, 1002 will die and init will receive SIGCHLD for it.
>>>>
>>>> Unfortunately neither the 1001 or 1002 processes are known to init, even
>>>> though they are original children of the process it spawned (1000), for
>>>> init to be notified about them - this has been forgotten.
>>>
>>> Ok, with this patch /sbin/init knows that 1002 is a descendant
>>> of apache(1000) which was spwaned by init. What can init do
>>> with this info?
>>>
>> Fundamentally init would now know that the apache service terminated,
>> and with what exit code or by what signal.
>>
>> Right now, all we know is that a process terminated (and why) - we can't
>> link that back to a service in any kind of foolproof manner.
>>
>> With the ability to do that, when the apache service dies, we can log
>> that in a more useful manner (including marking the service as down) -
>> but most importantly, we can respawn it!
>
> Aha, thanks, I suspected something like this.
>
> But how can we know (in general) that the service has died? We only
> know that the process has exited.
>
> IOW. when apache(1001) exits, we don't respawn. How can init know that
> the death of apache(1002) means "this is the real exit of service, the
> previous exits were due to initialization stage".
>
> The only answer I can see is: because init can figure out that all
> descendants of initially spawned apache(1000) have exited. But this
> doesn't look very flexible/reliable to me.
there are a number of options here.
init can look for error exit codes and respawn things that died with
errors.
init can be taught what normal behavior is for different daemons (some
sort of configuration options)
init can look for cases where all children have exited.
init could do monitoring of aplications, and if an application is deemed
'sick' can make sure that it kills all processes associated with that
application before trying to respawn it. (this is much more then what init
has doesn historicly, and may not be a good idea for the general case, but
it is still a possibility that will make sense in some cases)
I'm sure that there are other things that can be done with such a
mechansims, this is just what I can think of off the top of my head
> And we already have CONFIG_PROC_EVENTS, init can monitor PROC_EVENT_FORK
> events, so it can do this without ->adopt_signal ?
wouldn't that require init to pay attention to every fork in the system to
find the ones that it cares about?
also, an earlier post gave one reason for wanting to use signals being to
eliminate race conditions.
David Lang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists