linux-kernel - Re: [RFC 1/3] oom, sysrq: Skip over oom victims and killed tasks

Message-ID: <alpine.DEB.2.10.1601191458100.7346@chino.kir.corp.google.com>
Date:	Tue, 19 Jan 2016 15:01:44 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>
cc:	Michal Hocko <mhocko@...nel.org>, linux-mm@...ck.org,
	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC 1/3] oom, sysrq: Skip over oom victims and killed tasks

On Fri, 15 Jan 2016, One Thousand Gnomes wrote:

> > > I think it's time to kill sysrq+F and I'll send those two patches
> > > unless there is a usecase I'm not aware of.
> > 
> > I have described one in the part you haven't quoted here. Let me repeat:
> > : Your system might be thrashing to the point you are not able to log in
> > : and resolve the situation in a reasonable time yet you are still not
> > : OOM. sysrq+f is your only choice then.
> > 
> > Could you clarify why it is better to ditch a potentially useful
> > emergency tool rather than to make it work reliably and predictably?
> 
> Even if it doesn't work reliably and predictably it is *still* better
> than removing it as it works currently. Today we have "might save you a
> reboot", the removal turns it into "you'll have to reboot". That's a
> regression.
> 

Under what circumstances do you expect to use sysrq+f in your
hypothetical?  If you have access to the shell, then you can kill any
process at random (and you may even be able to make better realtime
decisions than the oom killer), and under my proposal the killed process
will gain access to memory reserves immediately when it tries to allocate
memory.  The net result is that calling the oom killer is no better than
issuing the SIGKILL yourself.

This doesn't work if you expect to use sysrq+f without being able to
get access to the shell.  That's the point, I believe, that Michal has
raised in this thread.  I'd like to address that issue directly rather
than requiring human intervention to fix it.  If you have deployed a very
large number of machines to your datacenters, you can't possibly have the
resources to intervene by hand on each one.