Message-ID: <20090122101424.GA12317@ioremap.net>
Date: Thu, 22 Jan 2009 13:14:24 +0300
From: Evgeniy Polyakov <zbr@...emap.net>
To: David Rientjes <rientjes@...gle.com>
Cc: Nikanth Karthikesan <knikanth@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Chris Snook <csnook@...hat.com>,
Arve Hjønnevåg <arve@...roid.com>,
Paul Menage <menage@...gle.com>,
containers@...ts.linux-foundation.org
Subject: Re: [RFC] [PATCH] Cgroup based OOM killer controller
On Thu, Jan 22, 2009 at 02:00:55AM -0800, David Rientjes (rientjes@...gle.com) wrote:
>
> In an exclusive cpuset, a task's memory is restricted to a set of mems
> that the administrator has designated. If it is oom, the kernel must free
> memory on those nodes or the next allocation will again trigger an oom
> (leading to a needlessly killed task that was in a disjoint cpuset).
>
> Really.
The whole point of the oom-killer is to kill the most appropriate task
to free memory. While the task is selected system-wide, and some
tunables have been added to tweak the behaviour local to certain
subsystems, this cpuset feature is hardcoded into the selection algorithm.
And when some tunable starts doing its own calculation, the behaviour of
this hardcoded feature changes.
That change is intended, because the admin has to have the ability to tune
the system the way he needs, rather than rely on some special heuristics
which may not work all the time.
That is the point against the cpuset argument: make it tunable the same
way we have oom_adj and/or this cgroup order feature.
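
A minimal sketch of the argument above, not the kernel's actual code: a
system-wide score with a per-task oom_adj adjustment (modelled loosely as
a bit shift) and a cpuset penalty that is either fixed or adjustable. The
names Task, badness, select_victim and the cpuset_penalty_shift parameter
are hypothetical and exist only for illustration; no such kernel knob is
implied.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    base_points: int          # stand-in for the memory-based score
    oom_adj: int = 0          # per-task tunable set by the admin
    shares_mems: bool = True  # allowed memory overlaps the allocating task's


def badness(task: Task, cpuset_penalty_shift: int = 3) -> int:
    """Score a task; higher means more likely to be killed.

    cpuset_penalty_shift is an assumed parameter: a fixed value models the
    hardcoded cpuset rule, passing another value models making it tunable.
    """
    points = task.base_points
    if not task.shares_mems:
        # Tasks in a disjoint cpuset are penalised as candidates.
        points >>= cpuset_penalty_shift
    if task.oom_adj > 0:
        points <<= task.oom_adj
    elif task.oom_adj < 0:
        points >>= -task.oom_adj
    return points


def select_victim(tasks, cpuset_penalty_shift: int = 3) -> Task:
    """Pick the highest-scoring task system-wide."""
    return max(tasks, key=lambda t: badness(t, cpuset_penalty_shift))


tasks = [
    Task("big_disjoint", base_points=1000, shares_mems=False),
    Task("small_local", base_points=200),
]
```

With the fixed penalty the small task sharing the cpuset is chosen; with
the penalty set to zero by the admin, the larger task is chosen instead.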
> > In this case administrator will not do this. It is up to him to decide
> > and not some inner kernel policy.
> >
>
> Then the scope of this new cgroup is restricted to not being used with
> cpusets that could oom.
These are orthogonal tasks: cpusets limit one area of oom handling,
cgroup order limits another. Some people need cpusets, others want
cgroups. cpusets are not so exceptional that only they have to be
taken into account when doing a system-wide operation like OOM
condition handling.
--
Evgeniy Polyakov