Message-ID: <alpine.DEB.2.00.0907271658550.27881@chino.kir.corp.google.com>
Date: Mon, 27 Jul 2009 17:10:26 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Paul Menage <menage@...gle.com>
cc: Andrew Morton <akpm@...ux-foundation.org>,
Rik van Riel <riel@...hat.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
linux-kernel@...r.kernel.org
Subject: Re: [patch -mmotm] mm: introduce oom_adj_child

On Mon, 27 Jul 2009, Paul Menage wrote:
> On Sun, Jul 26, 2009 at 2:50 PM, David Rientjes<rientjes@...gle.com> wrote:
> > +If oom_adj_child is set to equal oom_adj, then it will mirror oom_adj whenever
> > +it changes. This avoids having to set both values when simply tuning oom_adj
> > +and that value should be inherited by all children.
>
> Maybe have a distinct value for oom_adj_child (the default) that means
> "default to mm->oom_adj" ?
>
That's implicitly what mm->oom_adj == mm->oom_adj_child means. If they
are equal at the time oom_adj is changed, oom_adj_child also changes, but
if oom_adj_child differs then it remains static.
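
To illustrate that rule, here is a minimal userspace sketch, not code from
the patch. The struct and the set_oom_adj() helper are hypothetical names
made up for this example; only the oom_adj and oom_adj_child field names
come from the proposal.

```c
/* Userspace model only: stands in for the two values kept in the mm. */
struct mm_model {
	int oom_adj;
	int oom_adj_child;
};

/* Hypothetical helper modelling a write to oom_adj. */
static void set_oom_adj(struct mm_model *mm, int new_adj)
{
	/* Mirror only if the two values were equal before this write. */
	if (mm->oom_adj_child == mm->oom_adj)
		mm->oom_adj_child = new_adj;
	mm->oom_adj = new_adj;
}
```

Once oom_adj_child has been set to something different, later writes to
oom_adj leave it alone until the two are made equal again.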

> Shouldn't oom_adj_child be per-task? Otherwise you're theoretically
> allowing races between different threads that try to fork children
> with different oom_adj values at the same time. Not a particularly
> likely problem, but it seems bad to bake the chance of races into the
> API.
>
Good point, the newly initialized mm can get its oom_adj value from
current rather than current->mm.
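
Roughly, and only as a userspace sketch under the assumption that
oom_adj_child moves into the task, the fork path would look like the
following. The struct names and init_child_mm() are hypothetical and do not
correspond to real kernel functions.

```c
/* Userspace model only: the mm keeps oom_adj, each task keeps its own
 * oom_adj_child (assumed layout for illustration). */
struct mm_sketch {
	int oom_adj;
};

struct task_sketch {
	struct mm_sketch *mm;	/* may be shared between threads */
	int oom_adj_child;	/* per-task, so threads cannot race on it */
};

/* Hypothetical helper: the new mm takes its oom_adj from the forking
 * task rather than from the (possibly shared) parent mm. */
static void init_child_mm(struct mm_sketch *child_mm,
			  const struct task_sketch *parent)
{
	child_mm->oom_adj = parent->oom_adj_child;
}
```

Two threads sharing one mm can then fork children with different oom_adj
values without touching shared state.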

> Also, I'm not sure that the requirement that oom_adj_child be >=
> oom_adj is a good restriction. Sure, if a task gives its child a lower
> oom_adj than itself it's potentially playing with fire, but it may
> well be that the new child is expected to daemonize itself in the very
> near future and hence no longer be the child of the current process. I
> don't think that restricting the values that the sysadmin or root
> processes can apply on the grounds that they might not do what they
> want is the right approach.
>
Ok, we can allow oom_adj_child to be less than oom_adj for
CAP_SYS_RESOURCE.
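
As a sketch of what that check might look like (a userspace model, not the
v2 patch): check_oom_adj_child() and its boolean capability argument are
hypothetical, and the range constants are assumed to match the existing
oom_adj limits.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumed to match the existing oom_adj range. */
#define OOM_DISABLE	(-17)
#define OOM_ADJUST_MIN	(-16)
#define OOM_ADJUST_MAX	15

/* Hypothetical helper: validate a write of new_child to oom_adj_child.
 * has_cap stands in for a capable(CAP_SYS_RESOURCE) check. */
static int check_oom_adj_child(int oom_adj, int new_child, bool has_cap)
{
	if ((new_child < OOM_ADJUST_MIN || new_child > OOM_ADJUST_MAX) &&
	    new_child != OOM_DISABLE)
		return -EINVAL;
	/* Going below the task's own oom_adj needs the capability. */
	if (new_child < oom_adj && !has_cap)
		return -EACCES;
	return 0;
}
```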

> It would also maybe be nicer to use a prctl() rather than introducing
> yet another file in /proc/<pid> - but I guess that's a style argument
> rather than a strict technical issue.
>
Right, you had mentioned that to me earlier. I opted to use procfs
because it puts all the tunables in one place so adjusting it from
userspace is easier for applications that care about oom_adj. prctl()
only affects signals and capabilities at the moment and lacks any other
tunables that correspond to functionalities of procfs entities.

Andrew, please disregard this version, I'll be sending a v2 based on
Paul's comments.