netdev - Re: RFC: pid "ownership" of ip config information


Message-ID: <4D3C1FF5.2010607@gmail.com>
Date:	Sun, 23 Jan 2011 13:32:53 +0100
From:	Nicolas de Pesloüan 
	<nicolas.2p.debian@...il.com>
To:	Patrick Schaaf <netdev@....de>
CC:	netdev@...r.kernel.org
Subject: Re: RFC: pid "ownership" of ip config information

On 23/01/2011 11:24, Patrick Schaaf wrote:
> On Fri, 2011-01-21 at 11:17 +0100, Nicolas de Pesloüan wrote:
>> On 21/01/2011 10:28, Patrick Schaaf wrote:
>>> The alternative to such a feature, would be to have an additional
>>> monitoring process, which would watch the PID somehow, and need to
>>> be configured to know what to withdraw when it dies.
>
>> There are some user-space clustering systems that should provide the same functionality. Did you
>> have a look at http://www.linux-ha.org/ ?
>
> Those would be the more complex instances of "an additional monitoring
> process", right?
>
> What happens when heartbeat is "kill -9"ed? Assume that I want to avoid
> STONITH-like approaches.
>
> My proposal could be _used_ by such complex clustering managers, too.
>
> Or did I overlook a kernel-based solution there to "withdraw IP config
> when processes die"?
>
> Can you provide a direct link on linux-ha?

Do you consider "withdraw IP config" the only feature that is needed when a process dies? Or should
we instead design a more generic framework to run a command or make a system call when a process
dies? /sbin/init probably already does something similar. Arguably, even init may hang...
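
To make the "run a command when a process dies" idea concrete, here is a minimal userland sketch in Python. It is only an illustration, not the kernel feature under discussion: the names run_with_cleanup and withdraw_command are hypothetical, and the address and device in the usage comment are placeholders. It also has exactly the weakness mentioned above, since the supervisor itself may hang or be killed.

```python
import subprocess
import sys


def withdraw_command(address, device):
    """Build the iproute2 command that removes `address` from `device`."""
    return ["ip", "addr", "del", address, "dev", device]


def run_with_cleanup(cmd, on_exit):
    """Start `cmd` as a child and call `on_exit(returncode)` once it ends.

    wait() returns however the child terminates, including "kill -9";
    on POSIX a death by signal N is reported as returncode -N.
    """
    child = subprocess.Popen(cmd)
    returncode = child.wait()
    print(f"child exited with {returncode}", file=sys.stderr)
    on_exit(returncode)
    return returncode


# Hypothetical usage (daemon path, address and device are placeholders):
#   run_with_cleanup(
#       ["/usr/sbin/mydaemon"],
#       lambda rc: subprocess.call(withdraw_command("192.0.2.10/32", "eth0")),
#   )
```

The cleanup is passed in as a callback, so the same wrapper could run any command, not only an IP withdrawal.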

If your point is to provide a safety net for a very sick but not quite dead node, then no userland
system would help. As such, I agree with you that an automatic withdrawal of the IP config might help.
However, how would you protect against a simple never-ending loop in the process, or against a very
slow process due to high load on the node? You probably also need to guard against a process that no
longer reads its network receive queue.

This might end up as some sort of local heartbeat monitoring of userland processes, in the
kernel, and I'm not sure anyone would support this.
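
As a rough illustration of what such heartbeat monitoring has to track, here is a minimal sketch in Python. It is a userland model under my own assumptions, not a proposed kernel interface: the HeartbeatWatchdog class, its timeout parameter and the injectable clock are all hypothetical. The monitored process would call beat() from the loop that reads its receive queue, so a never-ending loop or a stalled reader stops beating and is eventually reported as expired.

```python
import time


class HeartbeatWatchdog:
    """Track the last heartbeat of a process and report staleness."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout      # seconds of silence tolerated
        self.clock = clock          # injectable so it can be tested
        self.last_beat = clock()

    def beat(self):
        """Record that the monitored process is still making progress."""
        self.last_beat = self.clock()

    def expired(self):
        """True if no heartbeat arrived within the timeout."""
        return self.clock() - self.last_beat > self.timeout
```

Note that a merely slow process on a loaded node looks the same as a hung one here; the timeout has to be chosen with that in mind.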

And whatever you do locally on a node to ensure proper operation, you also need a way to check for
proper operation from outside the node. A STONITH system is always required, in order to kill a
totally mad node. Even the kernel may go mad.

	Nicolas.