Message-ID: <alpine.DEB.2.00.0908030050160.30778@chino.kir.corp.google.com>
Date:	Mon, 3 Aug 2009 00:59:18 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>,
	Paul Menage <menage@...gle.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [patch -mm v2] mm: introduce oom_adj_child

On Mon, 3 Aug 2009, KAMEZAWA Hiroyuki wrote:

> >  - /proc/pid/oom_score is inconsistent when tuning /proc/pid/oom_adj if it
> >    relies on the per-thread oom_adj; it now really represents nothing but
> >    an incorrect value if other threads share that memory and misleads the
> >    user on how the oom killer chooses victims, or
> 
> That's why I said to show effective_oom_adj if necessary.
> 

Right, but which of the following two behaviors do you believe the 
majority of today's user applications are written to use?

 (1) /proc/pid/oom_score represents the badness heuristic that the oom
     killer uses to determine which task to kill, or

 (2) /proc/pid/oom_adj can be adjusted after vfork() and prior to exec()
     to represent the oom preference of the child without simultaneously
     changing the oom preference of the parent.

The two are in direct conflict and cannot coexist.  I favor behavior 
(1), which is why my patches make it consistent in _all_ cases: it is 
more likely than not that the majority of user applications rely on that 
behavior, if for no other reason than that it is the DOCUMENTED behavior.
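
As a rough illustration of what an application relying on behavior (1) 
does, here is a minimal sketch that reads both files for a task and 
treats oom_score as the value the oom killer will act on.  The helper 
names and the proc_root parameter are hypothetical and exist only for 
this example; the only interfaces assumed are /proc/pid/oom_adj and 
/proc/pid/oom_score as discussed above.

```python
import os


def read_proc_int(pid, name, proc_root="/proc"):
    """Return the integer in <proc_root>/<pid>/<name>, or None if unreadable.

    proc_root is a hypothetical parameter so the helper can be pointed at
    a directory other than /proc.
    """
    path = os.path.join(proc_root, str(pid), name)
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # File missing, permission denied, or contents not an integer.
        return None


def oom_snapshot(pid, proc_root="/proc"):
    """Collect the tunable (oom_adj) and the reported badness (oom_score)."""
    return {
        "oom_adj": read_proc_int(pid, "oom_adj", proc_root),
        "oom_score": read_proc_int(pid, "oom_score", proc_root),
    }


if __name__ == "__main__":
    # "self" resolves to the calling process under /proc on Linux.
    print(oom_snapshot("self"))
```

Such a caller is only correct if oom_score reflects what the oom killer 
really uses, which is the consistency argued for here.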

If you feel that's an unreasonable conclusion, then please say so, so 
that your argument can be judged on your interpretation of that behavior, 
which I believe most others would disagree with.  Otherwise, our 
discussion will continue to go in circles.

> >  - /proc/pid/oom_score is inconsistent when the thread that set the
> >    effective per-mm oom_adj exits and it is now obsolete since you have
> >    no way to determine what the next effective oom_adj value shall be.
> > 
> Please recalculate it. It's not a big job if done lazily.
> 

You can't recalculate it if all the remaining threads have a different 
oom_adj value than the effective oom_adj value set by the thread that has 
now exited.  It cannot be assumed that, for instance, the most negative 
oom_adj value shall then be used.  Imagine the effective oom_adj value 
being +15 while a thread sharing the same memory has an oom_adj value of 
-16.  Under no reasonable circumstance should the oom preference of every 
thread sharing that memory then change to -16 merely as a side effect of 
a thread exiting.
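
The ambiguity can be shown with a toy model.  This is only a sketch, not 
kernel code: the function name and the two candidate rules are 
hypothetical, and the second remaining thread with an oom_adj of 0 is an 
assumption added so that the rules can disagree.  Only the +15 and -16 
values come from the example above.

```python
def candidate_effective_values(remaining_oom_adj):
    """Toy model: values a 'recalculation' might pick once the thread
    that set the effective oom_adj has exited.

    Both rules are hypothetical; nothing defines which one is right.
    """
    if not remaining_oom_adj:
        return {}
    return {
        "most_negative": min(remaining_oom_adj),
        "most_positive": max(remaining_oom_adj),
    }


# Effective value set by the thread that has now exited.
old_effective = 15
# Per-thread values of the threads still sharing the memory.
# -16 is from the example; 0 is an assumed extra thread.
remaining = [-16, 0]

candidates = candidate_effective_values(remaining)

# The rules disagree with each other, and none reproduces the old value.
rules_disagree = len(set(candidates.values())) > 1
old_value_lost = old_effective not in candidates.values()

if __name__ == "__main__":
    print(candidates, rules_disagree, old_value_lost)
```

Whichever rule is chosen, the preference for the shared memory changes 
without any thread having written to oom_adj.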

That's the _entire_ reason why we need consistency in oom_adj values: so 
that userspace is aware of how the oom killer really works and chooses 
tasks.  I understand that it differs from the previously allowed behavior, 
but those userspace applications need to be fixed, if for no other reason 
than to make them consistent with how the oom killer kills tasks.  I think 
that's a very worthwhile goal, and the cost of moving to a new interface 
such as /proc/pid/oom_adj_child to keep the inheritance property that was 
available in the past is justified.