Message-Id: <20250814161345.b2ddf7120dfcc420c3199e67@linux-foundation.org>
Date: Thu, 14 Aug 2025 16:13:45 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: <zhongjinji@...or.com>
Cc: <linux-mm@...ck.org>, <mhocko@...e.com>, <rientjes@...gle.com>,
<shakeel.butt@...ux.dev>, <npache@...hat.com>,
<linux-kernel@...r.kernel.org>, <tglx@...utronix.de>, <mingo@...hat.com>,
<peterz@...radead.org>, <dvhart@...radead.org>, <dave@...olabs.net>,
<andrealmeid@...lia.com>, <liam.howlett@...cle.com>, <liulu.liu@...or.com>,
<feng.han@...or.com>, Joel Savitz <jsavitz@...hat.com>, Thomas Gleixner
<tglx@...utronix.de>
Subject: Re: [PATCH v4 0/3] mm/oom_kill: Only delay OOM reaper for processes
using robust futexes

On Thu, 14 Aug 2025 21:55:52 +0800 <zhongjinji@...or.com> wrote:
> The OOM reaper quickly reclaims a process's memory when the system hits OOM,
> helping the system recover. Without the OOM reaper, if a process frozen by
> cgroup v1 is OOM killed, the victim's memory cannot be freed, leaving the
> system in a poor state. Even if the process is not frozen by cgroup v1,
> reclaiming victims' memory remains important, as having one more process
> working speeds up memory release.
>
> When processes holding robust futexes are OOM killed but waiters on those
> futexes remain alive, the robust futexes might be reaped before
> futex_cleanup() runs. This can cause the waiters to block indefinitely [1].
>
> To prevent this issue, the OOM reaper's work is delayed by 2 seconds [1]. Since
> many killed processes exit within 2 seconds, the OOM reaper rarely runs after
> this delay. However, robust futex users are few, so delaying OOM reap for all
> victims is unnecessary.
>
> If each thread's robust_list in a process is NULL, the process holds no robust
> futexes. For such processes, the OOM reaper should not be delayed. For
> processes holding robust futexes, to avoid issue [1], the OOM reaper must
> still be delayed.
>
> Patch 1 introduces process_has_robust_futex() to detect whether a process uses
> robust futexes. Patch 2 delays the OOM reaper only for processes holding robust
> futexes, improving OOM reaper performance. Patch 3 makes the OOM reaper and
> exit_mmap() traverse the maple tree in opposite orders to reduce PTE lock
> contention caused by unmapping the same vma.
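
For reference, the check described for patches 1 and 2 can be modelled in plain userspace C roughly as below. This is only a sketch under stated assumptions: the model_* types and functions are hypothetical stand-ins and not the patch's code, the 2000 ms value is taken from the "2 seconds" in the cover letter, and the zero delay for the other case is an assumption. It shows nothing more than "any thread with a non-NULL robust_list means the reap is delayed".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for kernel structures; not the patch's code. */
struct model_thread {
    void *robust_list;          /* NULL if the thread never registered a robust list */
    struct model_thread *next;  /* next thread in the same process */
};

struct model_process {
    struct model_thread *threads;   /* singly linked list of threads */
};

/* Assumed values: 2 s as in the cover letter, no delay otherwise. */
#define MODEL_REAP_DELAY_MS_ROBUST 2000u
#define MODEL_REAP_DELAY_MS_NONE   0u

/* Returns true if at least one thread has a non-NULL robust_list. */
static bool model_has_robust_futex(const struct model_process *p)
{
    for (const struct model_thread *t = p->threads; t; t = t->next) {
        if (t->robust_list)
            return true;
    }
    return false;
}

/* Picks the reaper delay from the result of the check above. */
static unsigned int model_reap_delay_ms(const struct model_process *p)
{
    return model_has_robust_futex(p) ? MODEL_REAP_DELAY_MS_ROBUST
                                     : MODEL_REAP_DELAY_MS_NONE;
}
```

In this model, a process whose threads all have a NULL robust_list gets no delay, and a single thread with a registered list is enough to select the 2-second delay.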

This all sounds sensible, given that we appear to be stuck with the
2-second hack.

What prevents one of the process's threads from creating a robust mutex
after we've inspected it with process_has_robust_futex()?
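
To make the window concrete, here is a toy sequential model of the ordering in question. It is a sketch only: the toy_* names are hypothetical, it is ordinary userspace C with no real concurrency or kernel locking, and it says nothing about whether the kernel actually permits this interleaving for an OOM victim. It just shows that a check-then-act decision goes stale if a thread registers a robust list after the check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread; only the field of interest is modelled. */
struct toy_thread {
    void *robust_list;  /* NULL until the thread registers a robust list */
};

/* Returns true if any of the n threads has a non-NULL robust_list. */
static bool toy_any_robust(const struct toy_thread *t, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (t[i].robust_list)
            return true;
    }
    return false;
}

struct toy_outcome {
    bool decided_to_delay;     /* decision taken at inspection time */
    bool robust_at_reap_time;  /* actual state when the reaper would run */
};

/* Replays the ordering: inspect, then a thread registers, then reap. */
static struct toy_outcome toy_check_then_register(struct toy_thread *t,
                                                  size_t n, void *head)
{
    struct toy_outcome o;

    o.decided_to_delay = toy_any_robust(t, n);    /* step 1: inspect */
    if (n > 0)
        t[0].robust_list = head;                  /* step 2: late registration */
    o.robust_at_reap_time = toy_any_robust(t, n); /* step 3: state at reap */
    return o;
}
```

If every robust_list starts out NULL, the model returns decided_to_delay == false together with robust_at_reap_time == true, which is the case being asked about.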