linux-kernel - Re: [PATCH 4.3-rc6] proc: fix oom_adj value read from /proc/<pid>/oom

Message-ID: <alpine.DEB.2.10.1510211352570.22059@chino.kir.corp.google.com>
Date:	Wed, 21 Oct 2015 13:59:01 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Hongjie Fang (方洪杰) 
	<Hongjie.Fang@...eadtrum.com>
cc:	"ebiederm@...ssion.com" <ebiederm@...ssion.com>,
	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 4.3-rc6] proc: fix oom_adj value read from
 /proc/<pid>/oom_adj

On Tue, 20 Oct 2015, Hongjie Fang (方洪杰) wrote:

> The oom_adj value read from /proc/<pid>/oom_adj is different from the 
> value written to /proc/<pid>/oom_adj. 

/proc/pid/oom_adj is deprecated and has been for years.  When a value is 
written to /proc/pid/oom_adj, it is converted, for legacy purposes, to the 
scale used by /proc/pid/oom_score_adj.  There is no exact way to do this 
since the scales of the tunables are different (the former acted as a 
simple bit shift on a badness score, the latter is a proportion of 
available memory).

You'll notice we never store the written oom_adj, and that's because after 
the conversion to oom_score_adj is done, in the units the oom killer 
actually uses to make killing decisions, it is no longer interesting.  
Userspace only needs to know what the effective policy is, and that may 
differ from what was written because there is no 1:1 mapping between 
tunables of different units.
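
To make the missing 1:1 mapping concrete, here is a minimal userspace 
sketch of the two conversions.  It is modeled on my recollection of the 
scaling in fs/proc/base.c and is not a copy of the kernel code: the 
function names are hypothetical, and the constant values (-17, 15, 1000) 
are assumptions that should be checked against include/uapi/linux/oom.h.

```c
#include <stdio.h>

/* Assumed values; verify against include/uapi/linux/oom.h. */
#define OOM_DISABLE        (-17)
#define OOM_ADJUST_MAX     15
#define OOM_SCORE_ADJ_MAX  1000

/* Write side (hypothetical name): scale a legacy oom_adj value into
 * oom_score_adj units.  C integer division truncates toward zero. */
int oom_adj_to_score_adj(int oom_adj)
{
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}

/* Read side (hypothetical name): scale the stored oom_score_adj back
 * into oom_adj units, again truncating toward zero. */
int score_adj_to_oom_adj(int score_adj)
{
    if (score_adj == OOM_SCORE_ADJ_MAX)
        return OOM_ADJUST_MAX;
    return (score_adj * -OOM_DISABLE) / OOM_SCORE_ADJ_MAX;
}

/* Print written value, stored value, and value read back for every
 * oom_adj in the assumed legal range. */
void print_round_trip(void)
{
    int adj;

    for (adj = OOM_DISABLE; adj <= OOM_ADJUST_MAX; adj++) {
        int score = oom_adj_to_score_adj(adj);

        printf("%4d -> %5d -> %4d\n", adj, score,
               score_adj_to_oom_adj(score));
    }
}
```

Under these assumptions, writing 1 stores 58 and reads back as 0, and 
writing 8 stores 470 and reads back as 7.  Only the stored oom_score_adj 
is what the oom killer sees.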

Rounding up positive oom_adj values and rounding down negative oom_adj 
values, as your patch does, is inconsistent with how the mapping has been 
done for years.  It risks current users biasing against their processes 
more than expected, so it's not a safe change to make, as Eric also 
suggested.
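
The bias can be illustrated with a small sketch.  The patch is not quoted 
in this message, so the second function below is only one hypothetical way 
to round away from zero when scaling, not the patch itself.  The function 
names are made up, and the constant values are assumptions to be checked 
against include/uapi/linux/oom.h.

```c
/* Assumed values; verify against include/uapi/linux/oom.h. */
#define OOM_DISABLE        (-17)
#define OOM_ADJUST_MAX     15
#define OOM_SCORE_ADJ_MAX  1000

/* Long-standing behavior as I understand it: integer division that
 * truncates toward zero. */
int scale_truncating(int oom_adj)
{
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}

/* Hypothetical alternative: round positive results up and negative
 * results down, i.e. away from zero. */
int scale_rounding_away(int oom_adj)
{
    int num = oom_adj * OOM_SCORE_ADJ_MAX;
    int den = -OOM_DISABLE;

    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;
    if (num >= 0)
        return (num + den - 1) / den;
    return -((-num + den - 1) / den);
}
```

With these assumptions, an oom_adj of 8 maps to 470 when truncating but to 
471 when rounding away from zero, so an existing user writing the same 
value would end up with a slightly higher oom_score_adj than before.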