linux-kernel - Re: [PATCH 1/2] mm, oom: introduce oom reaper

Date:	Tue, 2 Feb 2016 20:48:05 +0900
From:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To:	mhocko@...nel.org, rientjes@...gle.com
Cc:	akpm@...ux-foundation.org, mgorman@...e.de,
	penguin-kernel@...ove.sakura.ne.jp, torvalds@...ux-foundation.org,
	oleg@...hat.com, hughd@...gle.com, andrea@...nel.org,
	riel@...hat.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] mm, oom: introduce oom reaper

Michal Hocko wrote:
> > In this case, the oom reaper has ignored the next victim and doesn't do 
> > anything; the simple race has prevented it from zapping memory and does 
> > not reduce the livelock probability.
> > 
> > This can be solved either by queueing mm's to reap or involving the oom 
> > reaper into the oom killer synchronization itself.
> 
> as we have already discussed previously oom reaper is really tricky to
> be called from the direct OOM context. I will go with queuing. 
>  

OK. But it is not easy to build a reliable OOM-reap queuing chain. I think
that a dedicated kernel thread which does both the OOM-kill operation and the
OOM-reap operation will be needed. That would also handle the "sleeping for
too long with oom_lock held after sending SIGKILL" problem.
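
To make the race quoted above concrete, here is a small user-space sketch (not
kernel code). It contrasts a single-slot handoff, where a second victim is
ignored while the slot is busy, with a FIFO queue that keeps every victim
until it has been processed. All names in it (struct victim, slot_submit,
queue_submit and so on) are made up for illustration, and it deliberately
leaves out locking, victim lifetime and everything else that makes a reliable
chain hard to build.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a victim's address space; not mm_struct. */
struct victim {
	int id;
	int reaped;		/* set to 1 once the reaper has processed it */
	struct victim *next;	/* link used only by the FIFO variant */
};

/* Variant 1: single slot. Only one pending victim can be remembered. */
struct slot_reaper {
	struct victim *pending;
};

/* Returns 1 if the victim was accepted, 0 if it was dropped. */
static int slot_submit(struct slot_reaper *r, struct victim *v)
{
	assert(v != NULL);
	if (r->pending)
		return 0;	/* slot busy: this victim is ignored */
	r->pending = v;
	return 1;
}

/* Processes the pending victim, if any. Returns the number reaped. */
static int slot_run(struct slot_reaper *r)
{
	if (!r->pending)
		return 0;
	r->pending->reaped = 1;
	r->pending = NULL;
	return 1;
}

/* Variant 2: FIFO queue. Every submitted victim is kept until processed. */
struct queue_reaper {
	struct victim *head;
	struct victim *tail;
};

static void queue_submit(struct queue_reaper *r, struct victim *v)
{
	assert(v != NULL);
	v->next = NULL;
	if (r->tail)
		r->tail->next = v;
	else
		r->head = v;
	r->tail = v;
}

/* Drains the queue in submission order. Returns the number reaped. */
static int queue_run(struct queue_reaper *r)
{
	int n = 0;

	while (r->head) {
		struct victim *v = r->head;

		r->head = v->next;
		if (!r->head)
			r->tail = NULL;
		v->next = NULL;
		v->reaped = 1;
		n++;
	}
	return n;
}
```

With the single slot, submitting two victims back to back leaves the second
one unreaped; with the queue, one pass of queue_run() handles both. The sketch
says nothing about who guarantees that the drain loop itself gets to run,
which is the part I am worried about.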

> > I'm baffled by any reference to "memcg oom heavy loads", I don't 
> > understand this paragraph, sorry.  If a memcg is oom, we shouldn't be
> > disrupting the global runqueue by running oom_reaper at a high priority.  
> > The disruption itself is not only in first wakeup but also in how long the 
> > reaper can run and when it is rescheduled: for a lot of memory this is 
> > potentially long.  The reaper is best-effort, as the changelog indicates, 
> > and we shouldn't have a reliance on this high priority: oom kill exiting 
> > can't possibly be expected to be immediate.  This high priority should be 
> > removed so memcg oom conditions are isolated and don't affect other loads.
> 
> If this is a concern then I would be tempted to simply disable oom
> reaper for memcg oom altogether. For me it is much more important that
> the reaper, even though a best effort, is guaranteed to schedule if
> something goes terribly wrong on the machine.

I think that if something goes terribly wrong on the machine, a guarantee that
the reaper gets scheduled will not help unless we build a reliable queuing
chain. Building a reliable queuing chain will break some of the assumptions
provided by the current behavior. For me, a guarantee that the next OOM-kill
operation gets scheduled (with some or all of the memory reserves opened
globally) before building a reliable queuing chain is much more important.

>                       But ohh well... I will queue up a patch to do this
> on top. I plan to repost the full patchset shortly.

Maybe we all agree with introducing the OOM reaper without queuing, but I do
want to see a guarantee that the next OOM-kill operation gets scheduled before
we try to build a reliable queuing chain.