linux-kernel - Re: Improving OOM killer

Message-ID: <alpine.DEB.2.00.1002031141350.27853@chino.kir.corp.google.com>
Date:	Wed, 3 Feb 2010 11:52:25 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	Frans Pop <elendil@...net.nl>
cc:	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Rik van Riel <riel@...hat.com>, l.lunak@...e.cz,
	Andrew Morton <akpm@...ux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Nick Piggin <npiggin@...e.de>, jkosina@...e.cz,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: Improving OOM killer

On Wed, 3 Feb 2010, Frans Pop wrote:

> > * /proc/pid/oom_adj ranges from -1000 to +1000 to either
> > * completely disable oom killing or always prefer it.
> > */
> > points += p->signal->oom_adj;
> > 
> 
> Wouldn't that cause a rather huge compatibility issue given that the 
> current oom_adj works in a totally different way:
> 
> ! 3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score
> ! ------------------------------------------------------
> ! This file can be used to adjust the score used to select which processes
> ! should be killed in an  out-of-memory  situation.  Giving it a high score
> ! will increase the likelihood of this process being killed by the
> ! oom-killer.  Valid values are in the range -16 to +15, plus the special
> ! value -17, which disables oom-killing altogether for this process.
> 
> ?
> 

I thought about whether we'd need an additional, complementary tunable 
such as /proc/pid/oom_bias that would effect this new memory-charging bias 
in the heuristic.  It could be implemented so that writing to oom_adj 
would clear oom_bias and vice versa.
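
To illustrate the "writing one clears the other" idea, here is a minimal 
userspace sketch.  The struct and both helper names are hypothetical and 
exist only for this example; they are not kernel code, and no such 
oom_bias interface was proposed in patch form.

```c
/* Hypothetical per-process tunables, for illustration only. */
struct oom_tunables {
	int oom_adj;	/* legacy bitshift-style adjustment */
	int oom_bias;	/* hypothetical memory-charging bias */
};

/* Writing the legacy tunable resets the new one. */
static void write_oom_adj(struct oom_tunables *t, int value)
{
	t->oom_adj = value;
	t->oom_bias = 0;
}

/* Writing the new tunable resets the legacy one. */
static void write_oom_bias(struct oom_tunables *t, int value)
{
	t->oom_bias = value;
	t->oom_adj = 0;
}
```

With this arrangement at most one of the two values is non-zero at any 
time, so the heuristic would never have to combine them.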

Although that would certainly be possible, I didn't propose it for a 
couple of reasons:

 - it would clutter the space to have two separate tunables when the 
   metric that /proc/pid/oom_adj uses has been made obsolete by the new
   baseline as a fraction of total RAM, and

 - we have always exported OOM_DISABLE, OOM_ADJUST_MIN, and OOM_ADJUST_MAX
   via include/linux/oom.h so that userspace can use them sanely.  Setting
   a particular oom_adj value for anything other than OOM_DISABLE means 
   the score will be relative to other system tasks, so it's a value that 
   is typically calibrated at runtime rather than a static, hardcoded 
   value.

We could reuse /proc/pid/oom_adj for the new heuristic by scaling it with 
(oom_adj * 1000 / OOM_ADJUST_MAX), but that would leave it with far 
coarser granularity than the new range allows, and it would eventually 
become annoying and much more difficult to document.
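
As a rough sketch of what that scaling would look like, the helper below 
applies the expression above using C integer arithmetic.  The helper and 
table-printing function names are made up for this example; the constants 
are the values documented for oom_adj in the text quoted earlier.

```c
#include <stdio.h>

/* Values matching the documented oom_adj range quoted above. */
#define OOM_DISABLE	(-17)
#define OOM_ADJUST_MIN	(-16)
#define OOM_ADJUST_MAX	15

/* Hypothetical helper: scale a legacy oom_adj value by 1000 / OOM_ADJUST_MAX. */
static int oom_adj_to_scaled(int oom_adj)
{
	/* Integer division truncates toward zero. */
	return oom_adj * 1000 / OOM_ADJUST_MAX;
}

/* Print the scaled value for every legacy setting to show the step size. */
static void print_scaled_table(void)
{
	int adj;

	for (adj = OOM_ADJUST_MIN; adj <= OOM_ADJUST_MAX; adj++)
		printf("%3d -> %5d\n", adj, oom_adj_to_scaled(adj));
}
```

Each legacy step moves the scaled value by 66 or 67, so only about 30 of 
the 2001 possible settings are reachable.  Note also that OOM_ADJUST_MIN 
(-16) scales to -1066, which falls outside -1000..+1000 and would need 
clamping or special handling.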

Given your citation, I don't think we've ever documented, outside of the 
implementation itself, that /proc/pid/oom_adj acts as a bitshift.  So its 
use right now for anything other than OOM_DISABLE is probably based on 
scalar thinking.
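
For readers unfamiliar with the bitshift behaviour, the following is a 
simplified userspace model of how the current heuristic applies oom_adj 
to a task's score.  It is a sketch modeled on the adjustment step in 
badness() in mm/oom_kill.c, not a copy of it; the function name is 
invented here, and OOM_DISABLE is assumed to be handled separately by 
skipping the task before any score is computed.

```c
/* Illustrative model of the legacy oom_adj adjustment, not kernel code. */
static unsigned long apply_legacy_oom_adj(unsigned long points, int oom_adj)
{
	if (oom_adj > 0) {
		/* A zero score is bumped to 1 so the shift has an effect. */
		if (!points)
			points = 1;
		points <<= oom_adj;	/* each step doubles the score */
	} else if (oom_adj < 0) {
		points >>= -oom_adj;	/* each step halves the score */
	}
	return points;
}
```

So an oom_adj of +3 multiplies the score by 8 rather than adding 3 to it, 
which is why a value chosen with scalar thinking can have a much larger 
effect than intended.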