Message-ID: <20090113133555.GA28107@ioremap.net>
Date: Tue, 13 Jan 2009 16:35:55 +0300
From: Evgeniy Polyakov <zbr@...emap.net>
To: Theodore Tso <tytso@....edu>, Alan Cox <alan@...rguk.ukuu.org.uk>,
David Rientjes <rientjes@...gle.com>,
Bill Davidsen <davidsen@....com>, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: Linux killed Kenny, bastard!
On Tue, Jan 13, 2009 at 08:19:37AM -0500, Theodore Tso (tytso@....edu) wrote:
> Instead of trying to specify which process should be protected from
> the OOM killer by name, how about something which is inherited from
> the parent process? After all, if having the child not get killed due
> to OOM is important, the child won't even have a chance to run if the
> parent gets killed off. And in fact, we have something that fits that
> bill fairly well; getrlimit()/setrlimit(). Why not define a new
> resource limit which specifies a relative immunity to the oom_killer?
>
> Most of the infrastructure to support that will already be in place
> (i.e., shell support, PAM support in /etc/security/limits.conf); all
> that would need to be done is to teach a few userspace
> programs/libraries about the new resource limit.
>
> This would be a much cleaner approach, I would think.
It would be similar to the oom_adj parameter (although I did not find
where that is inherited from the parent), just with a different interface
for updating it. I do not think it would make the problem any easier to
solve, because the problem does not map directly onto the parent/child
hierarchy: there are cases when we do want to kill children (this phrase
just screams for the addition: and eat them), but only some processes
which are not really the most significant.
The existing oom score adjustment mechanism works for these cases, but by
itself it is not convenient to use. Even its documentation does not say
how it is used :) It is not a simple addition or subtraction: the score
is multiplied or divided by two to the power of the oom_adj value.
Plus really no one knows how scores are calculated except those who read
mm/oom_kill.c before going to sleep.
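To make the power-of-two behaviour concrete, here is a minimal sketch of the adjustment step only. It is not the kernel code: the function name adjust_score is made up for illustration, the raw score of 1000 is arbitrary, and the -17 "disable" value and the bump of a zero score to 1 are assumptions based on how kernels of this period handled oom_adj.

```python
# Illustrative model of how an oom_adj-style value scales a badness score.
# Assumption: -17 exempts the task entirely, as in kernels of this era.
OOM_DISABLE = -17


def adjust_score(points: int, oom_adj: int) -> int:
    """Scale a raw badness score by 2**oom_adj using bit shifts."""
    if oom_adj == OOM_DISABLE:
        return 0  # task is never selected
    if oom_adj > 0:
        # A zero score would stay zero after shifting, so start from 1.
        if points == 0:
            points = 1
        return points << oom_adj      # multiply by 2**oom_adj
    if oom_adj < 0:
        return points >> -oom_adj     # divide by 2**(-oom_adj), rounding down
    return points


# Show how quickly the same raw score grows or collapses.
for adj in (-16, -4, 0, 4, 15):
    print(adj, adjust_score(1000, adj))
```

With a raw score of 1000, an adjustment of +4 already gives 16000 and -4 gives 62, so small changes in the value swing the result by orders of magnitude. That is why picking a sensible intermediate value is hard without knowing the raw scores of the other tasks.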
So effectively oom_adj only works as an enable/disable switch, and since
no one knows how to tune it, it is better not to touch it at all. And get
ssh killed. I believe that if it is ever used, then only to disable the
oom killer for a task altogether, which is wrong, since the task still may
be killed, just after some others. My patch adds a simple priority for
that based on the name of the process, and those names are known to the
administrators who maintain the given system.
--
Evgeniy Polyakov