linux-kernel - Re: [PATCH v2]mm/oom-kill: direct hardware access processes should get bonus

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.DEB.2.00.1011091307240.7730@chino.kir.corp.google.com>
Date:	Tue, 9 Nov 2010 13:16:12 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	"Figo.zhang" <figo1802@...il.com>
cc:	lkml <linux-kernel@...r.kernel.org>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Andrew Morton <akpm@...l.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v2]mm/oom-kill: direct hardware access processes should
 get bonus

On Tue, 9 Nov 2010, Figo.zhang wrote:

>  
> the victim should not directly access hardware devices like Xorg server,
> because the hardware could be left in an unpredictable state, although 
> user-application can set /proc/pid/oom_score_adj to protect it. so i think
> those processes should get 3% bonus for protection.
> 

The logic here is wrong: if killing these tasks can leave hardware in an 
unpredictable state (and that state is presumably harmful), then they 
should be completely immune from oom killing since you're still leaving 
them exposed here to be killed.

So the question that needs to be answered is: why do these threads deserve 
to use 3% more memory (not >4%) than others without getting killed?  If 
there was some evidence that these threads have a certain quantity of 
memory they require as a fundamental attribute of CAP_SYS_RAWIO, then I 
have no objection, but that's going to be expressed in a memory quantity 
not a percentage as you have here.

The CAP_SYS_ADMIN heuristic has a background: it is used in the oom killer 
because we have used the same 3% in __vm_enough_memory() for a long time 
and we want consistency amongst the heuristics.  Adding additional bonuses 
with arbitrary values like 3% of memory for things like CAP_SYS_RAWIO 
makes the heuristic less predictable and moves us back toward the old 
heuristic which was almost entirely arbitrary.

Now before KOSAKI-san comes out and says the old heuristic considered 
CAP_SYS_RAWIO and the new one does not so it _must_ be a regression: the 
old heuristic also divided the badness score by 4 for that capability as a 
completely arbitrary value (just like 3% is here).  Other traits like 
runtime and nice levels were also removed from the heuristic.  What needs 
to be shown is that CAP_SYS_RAWIO requires additional memory just to run 
or we should neglect to free 3% of memory, which could be gigabytes, 
because it has this trait.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/