[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.0901131104420.8522@chino.kir.corp.google.com>
Date: Tue, 13 Jan 2009 11:15:26 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: Evgeniy Polyakov <zbr@...emap.net>
cc: Alan Cox <alan@...rguk.ukuu.org.uk>, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: Linux killed Kenny, bastard!
On Tue, 13 Jan 2009, Evgeniy Polyakov wrote:
> Should this explain why ssh is killed?
>
If you would like to make sshd immune from the oom killer, use
echo -17 > /proc/$(pidof sshd)/oom_adj
just like any other task. This score will be inherited by any task that
it executes, so you'll probably want to readjust your shell's oom_adj
score appropriately in your rc file.
> It is very subtle approach. Consider the case when you have a pool of
> threads/processes which are created and released on demand, there are
> several such pools for different servers and you do know which one
> will very likely being guilty.
>
> Who should adjust the scores for newly created processes? Who should
> check that processes in the first group have negative oom ajustment and
> in the second group a positive value? Who determines when its time to
> ajust the scores?
>
It is userspace's responsibility to set the policy, the kernel merely
provides the mechanism.
> > With oom_adj scores, you have the ability to specify oom kill preferences
> > within a cpuset or memory controller as well, whereas oom_victim_name is
> > global and very costly when not found in select_bad_process().
> >
You chose not to respond to this, which is a major flaw in your approach.
Your patch makes cpuset and memory controller oom killing much slower
because it requires two iterations through the system tasklist when your
global oom_victim_name task is either not running or in a disjoint cpuset
or memcg.
> It is not really costly, since most of the time we skip an entry and do
> not lock the task and do not calculate its badness value. No one scares
> that 'ps ax' is costly because it has to run through all the processes.
>
Talk to SGI about oom killer tasklist scans for their large systems; it
was a prerequisite for me to provide /proc/sys/vm/oom_kill_allocating_task
to avoid a single scan when I made cpuset-constrained ooms go through
select_bad_process().
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists