linux-kernel - Re: [PATCH v4 2/3] mm/oom_kill: Only delay OOM reaper for processes using robust futexes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aKRWtjRhE_HgFlp2@tiehlicka>
Date: Tue, 19 Aug 2025 12:49:26 +0200
From: Michal Hocko <mhocko@...e.com>
To: zhongjinji <zhongjinji@...or.com>
Cc: akpm@...ux-foundation.org, andrealmeid@...lia.com, dave@...olabs.net,
	dvhart@...radead.org, feng.han@...or.com, liam.howlett@...cle.com,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	liulu.liu@...or.com, mingo@...hat.com, npache@...hat.com,
	peterz@...radead.org, rientjes@...gle.com, shakeel.butt@...ux.dev,
	tglx@...utronix.de
Subject: Re: [PATCH v4 2/3] mm/oom_kill: Only delay OOM reaper for processes
 using robust futexes

On Mon 18-08-25 20:08:19, zhongjinji wrote:
> > On Thu 14-08-25 21:55:54, zhongjinji@...or.com wrote:
> > > From: zhongjinji <zhongjinji@...or.com>
> > > 
> > > The OOM reaper can quickly reap a process's memory when the system encounters
> > > OOM, helping the system recover. Without the OOM reaper, if a process frozen
> > > by cgroup v1 is OOM killed, the victims' memory cannot be freed, and the
> > > system stays in a poor state. Even if the process is not frozen by cgroup v1,
> > > reaping victims' memory is still meaningful, because having one more process
> > > working speeds up memory release.
> > > 
> > > When processes holding robust futexes are OOM killed but waiters on those
> > > futexes remain alive, the robust futexes might be reaped before
> > > futex_cleanup() runs. It would cause the waiters to block indefinitely.
> > > To prevent this issue, the OOM reaper's work is delayed by 2 seconds [1].
> > > The OOM reaper now rarely runs since many killed processes exit within 2
> > > seconds.
> > > 
> > > Because robust futex users are few, it is unreasonable to delay OOM reap for
> > > all victims. For processes that do not hold robust futexes, the OOM reaper
> > > should not be delayed and for processes holding robust futexes, the OOM
> > > reaper must still be delayed to prevent the waiters to block indefinitely [1].
> > > 
> > > Link: https://lore.kernel.org/all/20220414144042.677008-1-npache@redhat.com/T/#u [1]
> > 
> > What has happened to
> > https://lore.kernel.org/all/aJGiHyTXS_BqxoK2@tiehlicka/T/#u ?
> 
> If a process holding robust futexes gets frozen, robust futexes might be reaped before
> futex_cleanup() runs when an OOM occurs. I am not sure if this will actually happen.

Yes, and 2s delay will never rule that out. Especially for frozen tasks
which could be frozen undefinitely. That is not the point I have tried
to make. I was suggesting not treating futex specially because no matter
what we do this will always be racy and a hack to reduce the risk. We
simply cannot deal with that case more gracefully without a major
surgery to the futex implementation which is not desirable for this
specific reason.

So instead to checking for futex which Thomas was not happy about too
let's just reap _frozen_/_freezing_ tasks right away as that makes at
least some sense and it also handles your primary problem AFAIU.

> > Generally speaking it would be great to provide a link to previous
> > versions of the patchset. I do not see v3 in my inbox (which is quite
> > messy ATM so I might have easily missed it).
> 
> This is version v3, where I mainly fixed the error in the Subject prefix,
> changing it from futex to mm/oom_kill.
> 
> https://lore.kernel.org/all/20250804030341.18619-1-zhongjinji@honor.com/
> https://lore.kernel.org/all/20250804030341.18619-2-zhongjinji@honor.com/

please always mention that in the cover letter.

Thanks.
-- 
Michal Hocko
SUSE Labs