[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.1004081405180.8347@chino.kir.corp.google.com>
Date: Thu, 8 Apr 2010 14:08:26 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Oleg Nesterov <oleg@...hat.com>
cc: Andrew Morton <akpm@...ux-foundation.org>,
anfei <anfei.zhou@...il.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
nishimura@....nes.nec.co.jp,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Mel Gorman <mel@....ul.ie>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [patch] oom: give current access to memory reserves if it has
been killed
On Thu, 1 Apr 2010, Oleg Nesterov wrote:
> > > > > Say, oom_forkbomb_penalty() does list_for_each_entry(tsk->children).
> > > > > Again, this is not right even if we forget about !child->mm check.
> > > > > This list_for_each_entry() can only see the processes forked by the
> > > > > main thread.
> > > > >
> > > >
> > > > That's the intention.
> > >
> > > Why? shouldn't oom_badness() return the same result for any thread
> > > in thread group? We should take all childs into account.
> > >
> >
> > oom_forkbomb_penalty() only cares about first-descendant children that
> > do not share the same memory,
>
> I see, but the code doesn't really do this. I mean, it doesn't really
> see the first-descendant children, only those which were forked by the
> main thread.
>
> Look. We have a main thread M and the sub-thread T. T forks a lot of
> processes which use a lot of memory. These processes _are_ the first
> descendant children of the M+T thread group, they should be accounted.
> But M->children list is empty.
>
> oom_forkbomb_penalty() and oom_kill_process() should do
>
> t = tsk;
> do {
> list_for_each_entry(child, &t->children, sibling) {
> ... take child into account ...
> }
> } while_each_thread(tsk, t);
>
>
In this case, it seems more appropriate that we would penalize T and not M
since it's not necessarily responsible for the behavior of the children it
forks. T is the buggy/malicious program, not M.
> See the patch below. Yes, this is minor, but it is always good to avoid
> the unnecessary locks, and thread_group_cputime() is O(N).
>
> Not only for performance reasons. This allows to change the locking in
> thread_group_cputime() if needed without fear to deadlock with task_lock().
>
> Oleg.
>
> --- x/mm/oom_kill.c
> +++ x/mm/oom_kill.c
> @@ -97,13 +97,16 @@ static unsigned long oom_forkbomb_penalt
> return 0;
> list_for_each_entry(child, &tsk->children, sibling) {
> struct task_cputime task_time;
> - unsigned long runtime;
> + unsigned long runtime, this_rss;
>
> task_lock(child);
> if (!child->mm || child->mm == tsk->mm) {
> task_unlock(child);
> continue;
> }
> + this_rss = get_mm_rss(child->mm);
> + task_unlock(child);
> +
> thread_group_cputime(child, &task_time);
> runtime = cputime_to_jiffies(task_time.utime) +
> cputime_to_jiffies(task_time.stime);
> @@ -113,10 +116,9 @@ static unsigned long oom_forkbomb_penalt
> * get to execute at all in such cases anyway.
> */
> if (runtime < HZ) {
> - child_rss += get_mm_rss(child->mm);
> + child_rss += this_rss;
> forkcount++;
> }
> - task_unlock(child);
> }
>
> /*
This patch looks good, will you send it to Andrew with a changelog and
sign-off line? Also feel free to add:
Acked-by: David Rientjes <rientjes@...gle.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists