Message-ID: <alpine.DEB.2.00.1002031141350.27853@chino.kir.corp.google.com>
Date: Wed, 3 Feb 2010 11:52:25 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: Frans Pop <elendil@...net.nl>
cc: Balbir Singh <balbir@...ux.vnet.ibm.com>,
Rik van Riel <riel@...hat.com>, l.lunak@...e.cz,
Andrew Morton <akpm@...ux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Nick Piggin <npiggin@...e.de>, jkosina@...e.cz,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: Improving OOM killer
On Wed, 3 Feb 2010, Frans Pop wrote:
> > * /proc/pid/oom_adj ranges from -1000 to +1000 to either
> > * completely disable oom killing or always prefer it.
> > */
> > points += p->signal->oom_adj;
> >
>
> Wouldn't that cause a rather huge compatibility issue given that the
> current oom_adj works in a totally different way:
>
> ! 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score
> ! ------------------------------------------------------
> ! This file can be used to adjust the score used to select which processes
> ! should be killed in an out-of-memory situation. Giving it a high score
> ! will increase the likelihood of this process being killed by the
> ! oom-killer. Valid values are in the range -16 to +15, plus the special
> ! value -17, which disables oom-killing altogether for this process.
>
> ?
>
I thought about whether we'd need an additional, complementary tunable
such as /proc/pid/oom_bias that would effect this new memory-charging bias
in the heuristic. It could be implemented so that writing to oom_adj
would clear oom_bias and vice versa.
Although that would certainly be possible, I didn't propose it for a
couple of reasons:
- it would clutter the space to have two separate tunables when the
metric that /proc/pid/oom_adj uses has been made obsolete by the new
baseline as a fraction of total RAM, and
- we have always exported OOM_DISABLE, OOM_ADJUST_MIN, and OOM_ADJUST_MAX
via include/linux/oom.h so that userspace can use them sanely. Setting
a particular oom_adj value for anything other than OOM_DISABLE means
the score will be relative to other system tasks, so it's a value that
is typically calibrated at runtime rather than a static, hardcoded
value.
We could reuse /proc/pid/oom_adj for the new heuristic by scaling it with
(oom_adj * 1000 / OOM_ADJUST_MAX), but that would leave it with far
coarser granularity than the new range otherwise allows, and it would
eventually become annoying and much more difficult to document.
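To make the granularity concern concrete, here is a minimal sketch of that
scaling in C. The helper name scale_oom_adj() is hypothetical and is not
kernel code; the three constants are the values the kernel header defined
at the time, matching the -16 to +15 range and the special value -17 in
the documentation quoted above.

```c
#include <assert.h>

/* Values as defined in the kernel's oom.h at the time of this thread. */
#define OOM_DISABLE	(-17)
#define OOM_ADJUST_MIN	(-16)
#define OOM_ADJUST_MAX	15

/*
 * Hypothetical helper, not kernel code: map a legacy oom_adj value onto
 * the proposed -1000..+1000 scale using the expression from the mail.
 * C integer division truncates toward zero.
 */
static int scale_oom_adj(int oom_adj)
{
	/* Only the legacy range, including OOM_DISABLE, is accepted. */
	assert(oom_adj >= OOM_DISABLE && oom_adj <= OOM_ADJUST_MAX);
	return oom_adj * 1000 / OOM_ADJUST_MAX;
}
```

Under these assumptions, scale_oom_adj(1) is 66 and scale_oom_adj(2) is
133, so adjacent legacy values land 66 or 67 units apart and most of the
new range is unreachable. OOM_ADJUST_MIN maps to -1066, which falls
outside -1000..+1000, so a real implementation would also need to clamp
or special-case the negative end.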
Given your citation, I don't think we've ever described /proc/pid/oom_adj
outside of the implementation as a bitshift, either. So its use right now
for anything other than OOM_DISABLE is probably based on scalar thinking.
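For readers unfamiliar with the bitshift behaviour mentioned here, the
following is a simplified sketch of how a positive or negative oom_adj
scales the badness score in the existing heuristic. The function name
legacy_adjust() is hypothetical, and the handling of OOM_DISABLE (which
exempts the task before any score is computed) is left out.

```c
/*
 * Simplified sketch, assuming the existing behaviour: oom_adj is applied
 * to the badness score as a bitshift, not as an additive offset.
 * legacy_adjust() is a hypothetical name used only for illustration.
 */
static unsigned long legacy_adjust(unsigned long points, int oom_adj)
{
	if (oom_adj > 0) {
		/* A zero score would stay zero after shifting, so bump it. */
		if (!points)
			points = 1;
		points <<= oom_adj;	/* each step doubles the score */
	} else if (oom_adj < 0) {
		points >>= -oom_adj;	/* each step halves the score */
	}
	return points;
}
```

With a score of 100, an oom_adj of 3 yields 800 and an oom_adj of -2
yields 25. Each step is a factor of two rather than a fixed increment,
which is why a user reasoning about oom_adj as a linear scale would be
surprised by the result.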