Message-ID: <alpine.DEB.2.00.1002101405530.29007@chino.kir.corp.google.com>
Date: Wed, 10 Feb 2010 14:25:10 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: Lubos Lunak <l.lunak@...e.cz>
cc: Balbir Singh <balbir@...ux.vnet.ibm.com>,
Rik van Riel <riel@...hat.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Nick Piggin <npiggin@...e.de>, Jiri Kosina <jkosina@...e.cz>
Subject: Re: Improving OOM killer
On Wed, 10 Feb 2010, Lubos Lunak wrote:
> > Yes, forkbombs are not always malicious, they can be the result of buggy
> > code and there's no other kernel mechanism that will hold them off so that
> > the machine is still usable. If a task forks and execve's thousands of
> > threads on your 2GB desktop machine either because it's malicious, it's a
> > bug, or the user made a mistake, that's going to be detrimental
> > depending on the nature of what was executed especially to your
> > interactivity :) Keep in mind that the forking parent such as a job
> > scheduler or terminal and all of its individual children may have very
> > small rss and swap statistics, even though cumulatively it's a problem.
>
> Which is why I suggested summing up the memory of the parent and its
> children.
>
That's almost identical to the current heuristic, where we add half of
each child's VM size to the parent's score. Unfortunately, it's not a
good indicator of forkbombs, since in your particular example it would be
detrimental to kdeinit. My heuristic considers the runtime of the
children as an indicator of a forkbombing parent, since such tasks don't
typically get to run anyway. The rss or swap usage of a child with a
separate address space simply isn't relevant to the badness score of the
parent; it unfairly penalizes medium/large server jobs.
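To illustrate the point about kdeinit, here is a rough userspace model
of the "add half of each child's VM size" rule. It is a sketch, not
kernel code: the Task class, the task names, and the page counts are all
made up for the example, and only the general shape of the rule is taken
from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a process; not a kernel structure."""
    name: str
    total_vm: int                      # pages in this task's own address space
    children: list = field(default_factory=list)


def legacy_badness(task):
    """Own VM size plus half of each first-generation child's VM size."""
    points = task.total_vm
    for child in task.children:
        points += child.total_vm // 2 + 1
    return points


# A small launcher (think kdeinit) with four large, unrelated applications
# as children, next to a single task that is actually hogging memory.
launcher = Task("launcher", 1000, [Task("app", 50000) for _ in range(4)])
hog = Task("hog", 90000)

victim = max([launcher, hog], key=legacy_badness)
print(victim.name, legacy_badness(launcher), legacy_badness(hog))
```

In this toy setup the launcher scores higher than the hog purely because
of memory that lives in its children's separate address spaces, which is
the behavior being objected to.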
> > We can't address recursive forkbombing in the oom killer with any
> > efficiency, but luckily those cases aren't very common.
>
> Right, I've never run a recursive make that brought my machine to its knees.
> Oh, wait.
>
That's completely outside the scope of the oom killer, though: it is _not_
the oom killer's responsibility to enforce a kernel-wide forkbomb
policy, which would be much better handled at execve() time.

Forkbomb detection is a very small part of my badness heuristic. The
penalty depends on the average size of the children's rss and swap usage,
because we want to slightly penalize tasks that fork an extremely large
number of tasks that have no substantial runtime; memory is being
consumed but very little work is getting done by those thousand children.
More often than not, this would be used only to break ties when two
parents have similar memory consumption themselves but one is obviously
oversubscribing the system.
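The tie-breaking role can be sketched in a few lines. This is a toy
model under stated assumptions, not the proposed patch: the threshold of
1000 children, the one-second runtime cutoff, and the Task fields are
hypothetical values chosen only to make the example concrete.

```python
from dataclasses import dataclass, field

FORKBOMB_THRESHOLD = 1000    # hypothetical: children needed before any penalty
MIN_RUNTIME_SECONDS = 1.0    # hypothetical: below this a child has "not run"


@dataclass
class Task:
    """Hypothetical stand-in for a process; not a kernel structure."""
    rss: int
    swap: int
    runtime: float = 0.0
    children: list = field(default_factory=list)


def forkbomb_penalty(parent):
    """Average rss+swap of short-lived children, only past the threshold."""
    young = [c for c in parent.children if c.runtime < MIN_RUNTIME_SECONDS]
    if len(young) <= FORKBOMB_THRESHOLD:
        return 0
    return sum(c.rss + c.swap for c in young) // len(young)


def badness(task):
    """The task's own memory dominates; the penalty is a small addition."""
    return task.rss + task.swap + forkbomb_penalty(task)


# Two parents with identical memory use of their own. One has forked 2000
# children that have barely run; the other has five long-running children.
bomber = Task(4000, 1000, 300.0, [Task(10, 0, 0.1) for _ in range(2000)])
scheduler = Task(4000, 1000, 300.0, [Task(800, 0, 120.0) for _ in range(5)])

print(badness(bomber), badness(scheduler))
```

The penalty stays small relative to the parent's own usage, so it only
changes the outcome when the two candidates are otherwise close.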
> And why exactly is iterating over 1st level children efficient enough and
> doing that recursively is not? I don't find it significantly more expensive
> and badness() is hardly a bottleneck anyway.
>
If we look at children's memory usage recursively, then we'll always end
up selecting init_task.
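A small model shows why. If every task is charged for the memory of all
of its descendants, the root of the process tree is charged for
everything and always has the highest total. The tree, names, and sizes
below are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a process; not a kernel structure."""
    name: str
    rss: int
    children: list = field(default_factory=list)


def recursive_usage(task):
    """Own rss plus the rss of every descendant, at any depth."""
    return task.rss + sum(recursive_usage(c) for c in task.children)


def walk(task):
    """Yield the task and then all of its descendants."""
    yield task
    for child in task.children:
        yield from walk(child)


make = Task("make", 50, [Task("cc1", 4000), Task("cc1", 4000)])
shell = Task("bash", 30, [make])
init = Task("init", 10, [shell, Task("sshd", 200)])

worst = max(walk(init), key=recursive_usage)
print(worst.name, recursive_usage(worst))
```

With non-negative sizes, an ancestor's total can never be lower than a
descendant's, so the maximum always sits at the root.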
> > The memory consumption of these children were not considered in my rough
> > draft, it was simply a counter of how many first-generation children each
> > task has.
>
> Why exactly do you think only 1st generation children matter? Look again at
> the process tree posted by me and you'll see it solves nothing there. I still
> fail to see why counting also all other generations should be considered
> anything more than a negligible penalty for something that's not a bottleneck
> at all.
>
You're specifying a problem that is outside the scope of the oom killer,
sorry.