Message-ID: <20090127215118.GA12431@ioremap.net>
Date: Wed, 28 Jan 2009 00:51:18 +0300
From: Evgeniy Polyakov <zbr@...emap.net>
To: David Rientjes <rientjes@...gle.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Alan Cox <alan@...rguk.ukuu.org.uk>, balbir@...ux.vnet.ibm.com,
Nikanth Karthikesan <knikanth@...e.de>,
containers@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org,
Torvalds <torvalds@...ux-foundation.org>,
Arve Hjønnevåg <arve@...roid.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Chris Snook <csnook@...hat.com>,
Paul Menage <menage@...gle.com>
Subject: Re: [RFC] [PATCH] Cgroup based OOM killer controller
On Tue, Jan 27, 2009 at 12:37:21PM -0800, David Rientjes (rientjes@...gle.com) wrote:
> > Well, oom-killer can, since it drops unkillable state from the process
> > mask, that may be not enough though, but it tries more than userspace.
> >
>
> The only thing it does is send a SIGKILL and give the thread access to
> memory reserves with TIF_MEMDIE, it doesn't drop any unkillable state. If
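There is a small difference between force_sig_info() and the usual
send_signal() used by kill: force_sig_info() unblocks the signal and
resets an ignored handler to the default before delivering it.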
> its victim is hung in D state and the memory reserves do not allow it to
> return to being runnable, this task will not die and the oom killer would
> livelock unless given another target.
Not all D states are alike. The current tree even has
lock_page_killable(), so whether the task can die depends on where it
sleeps.
> > My main point was to have a way to monitor memory usage so that any
> > process could tune its own behaviour according to that information. Which is
> > not related to the system oom-killer at all. Thus /dev/mem_notify is
> > of interest first (and only first) as a memory usage notification
> > interface and not a way to invoke any kind of 'soft' oom-killer.
>
> It's a way to prevent invoking the kernel oom killer by allowing userspace
> notification of events where methods such as dropping caches, elevating
> limits, adding nodes, sending signals, etc, can prevent such a problem.
> When the system (or cgroup) is completely oom, it can also issue SIGKILLs
> that will free some memory and preempt the oom killer from acting.
>
> I think there might be some confusion about my proposal for extending
> /dev/mem_notify. Not only should it notify of certain low memory events,
> but it should also allow userspace notification of oom events, just like
> the cgroup oom notifier patch allowed. Instead of attaching a task to a
> cgroup file in that case, however, this would simply be the responsibility
> of a task that has set up a poll() on the cgroup's mem_notify file. A
> configurable delay could be imposed so page allocation attempts simply
> loop while the userspace handler responds and then only invoke the oom
> killer when absolutely necessary.
I really have no objections to this, or to extending the oom-killer so
that the allocation path waits a bit for userspace to make some
progress. But do not drop the existing oom-killer (i.e. its ability to
kill processes) in favour of this new feature. Let's have both: if the
extension fails for some reason, the old oom-killer will do the job.
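
For illustration, the userspace side of such a handler could block in
poll() on the notification file roughly like this. It is only a sketch
under assumptions: the interface is still a proposal, so the idea that
an event shows up as POLLIN on the file, and the helper name
wait_for_mem_event(), are hypothetical and not an existing API. Only
poll() itself is standard.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait until the notification fd becomes readable.
 *
 * Assumption: a low-memory or oom event is reported as POLLIN on the
 * fd (e.g. one opened on the proposed mem_notify file).
 *
 * Returns 1 if an event is pending, 0 if timeout_ms elapsed without
 * one, and -1 on error (including POLLERR/POLLHUP without data).
 */
int wait_for_mem_event(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	/* Restart the wait if a signal interrupts poll(). */
	do {
		ret = poll(&pfd, 1, timeout_ms);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0)
		return -1;
	if (ret == 0)
		return 0;	/* timed out, nothing to react to */

	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A handler would call this in a loop and, on a return of 1, drop caches,
raise limits or send signals as described above. If it does not manage
to free memory in time, the kernel oom-killer still acts as the
fallback.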
--
Evgeniy Polyakov