linux-kernel - Re: Memory overcommit

Message-ID: <alpine.DEB.2.00.0910300200170.18076@chino.kir.corp.google.com>
Date:	Fri, 30 Oct 2009 02:10:37 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
cc:	vedran.furac@...il.com, Hugh Dickins <hugh.dickins@...cali.co.uk>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	minchan.kim@...il.com, Andrew Morton <akpm@...ux-foundation.org>,
	Andrea Arcangeli <aarcange@...hat.com>
Subject: Re: Memory overcommit

On Fri, 30 Oct 2009, KAMEZAWA Hiroyuki wrote:

> As I wrote repeatedly,
> 
>    - OOM-Killer itself is bad thing, bad situation.

Not necessarily: the memory controller and cpusets use it quite often to 
enforce their policies, and that is standard runtime behavior.  We'd like 
to imagine that our cpuset will never be too small to run all the attached 
jobs, but that happens and we can easily recover from it by killing a 
task.

>    - The kernel can't know the program is bad or not. just guess it.

Totally irrelevant, given your fourth point about /proc/pid/oom_adj.  We 
can tell the kernel what we'd like the oom killer behavior to be if the 
situation arises.
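
For readers unfamiliar with the knob, here is a minimal sketch of setting 
it from userspace.  The helper name set_oom_adj and its proc_root 
parameter are hypothetical (proc_root exists only so the sketch can be 
tried against a scratch directory); the -17 and +15 bounds are the values 
discussed in this thread.

```python
from pathlib import Path

OOM_DISABLE = -17     # exempts the task from the oom killer entirely
OOM_ADJUST_MAX = 15   # marks the task as the preferred next victim


def set_oom_adj(pid, value, proc_root="/proc"):
    """Write an oom_adj value for `pid` and return the path written.

    proc_root is an assumption of this sketch, not a kernel interface.
    """
    if not OOM_DISABLE <= value <= OOM_ADJUST_MAX:
        raise ValueError(
            f"oom_adj must be in [{OOM_DISABLE}, {OOM_ADJUST_MAX}], got {value}"
        )
    path = Path(proc_root) / str(pid) / "oom_adj"
    path.write_text(f"{value}\n")
    return path
```

Calling set_oom_adj(pid, OOM_DISABLE) on a real system would likely need 
elevated privileges, since lowering the value is restricted.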

>    - Then, there is no "correct" OOM-Killer other than fork-bomb killer.

Well of course there is; you're seeing this in a WAY too simplistic 
manner.  If we are oom, we want to be able to influence how the oom killer 
behaves and respond to that situation.  You are proposing that we change 
the baseline for how the oom killer selects tasks which we use CONSTANTLY 
as part of our normal production environment.  I'd appreciate it if you'd 
take it a little more seriously.

>    - User has a knob as oom_adj. This is very strong.
> 

Agreed.

> Then, there is only "reasonable" or "easy-to-understand" OOM-Kill.
> "Current biggest memory eater is killed" sounds reasonable, easy to
> understand. And if total_vm works well, overcommit_guess should catch it.
> Please improve overcommit_guess if you want to stay on total_vm.
> 

I don't necessarily want to stay on total_vm, but I also don't want to 
move to rss as a baseline, as you would probably agree.

We disagree about a very fundamental principle: you are coming from a 
perspective of always wanting to kill the biggest resident memory eater, 
even for a single order-0 allocation that fails, and I'm coming from a 
perspective of wanting to ensure that our machines know how the oom killer 
will react when it is used.  Moving to rss reduces the ability of the user 
to specify an expected oom priority other than polarizing it by either 
disabling it completely with an oom_adj value of -17 or choosing the 
definite next victim with +15.  That's my objection to it: the user cannot 
possibly be expected to predict what proportion of each application's 
memory will be resident at the time of oom.
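
The objection can be sketched with a deliberately simplified model.  It 
assumes the score is just a baseline size scaled by a power of two chosen 
by oom_adj, and it ignores every other input the real heuristic considers. 
The task names and page counts are invented for illustration.

```python
OOM_DISABLE = -17


def badness(baseline_pages, oom_adj):
    """Simplified score: baseline size shifted left or right by oom_adj."""
    if oom_adj == OOM_DISABLE:
        return 0  # never selected
    if oom_adj > 0:
        return max(baseline_pages, 1) << oom_adj
    return baseline_pages >> -oom_adj


def pick_victim(tasks, baseline):
    """Return the name of the highest-scoring task for the given baseline."""
    return max(tasks, key=lambda t: badness(t[baseline], t["oom_adj"]))["name"]


# Two hypothetical snapshots of the same tasks; only batch's rss differs.
snapshot_a = [
    {"name": "batch", "oom_adj": 2, "total_vm": 400_000, "rss": 100_000},
    {"name": "server", "oom_adj": 0, "total_vm": 800_000, "rss": 300_000},
]
snapshot_b = [
    {"name": "batch", "oom_adj": 2, "total_vm": 400_000, "rss": 50_000},
    {"name": "server", "oom_adj": 0, "total_vm": 800_000, "rss": 300_000},
]

for label, snap in (("a", snapshot_a), ("b", snapshot_b)):
    print(label, pick_victim(snap, "total_vm"), pick_victim(snap, "rss"))
```

In this model the total_vm baseline picks "batch" in both snapshots, as the 
oom_adj of +2 intended, while the rss baseline picks "batch" in the first 
and "server" in the second.  The same oom_adj setting yields a different 
victim depending on how much of each task happens to be resident.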

I understand you want to totally rewrite the oom killer for whatever 
reason, but I think you need to spend a lot more time understanding the 
needs that the Linux community has for its behavior instead of insisting 
on your point of view.