Message-ID: <20250814135555.17493-1-zhongjinji@honor.com>
Date: Thu, 14 Aug 2025 21:55:52 +0800
From: <zhongjinji@...or.com>
To: <linux-mm@...ck.org>
CC: <akpm@...ux-foundation.org>, <mhocko@...e.com>, <rientjes@...gle.com>,
	<shakeel.butt@...ux.dev>, <npache@...hat.com>,
	<linux-kernel@...r.kernel.org>, <tglx@...utronix.de>, <mingo@...hat.com>,
	<peterz@...radead.org>, <dvhart@...radead.org>, <dave@...olabs.net>,
	<andrealmeid@...lia.com>, <liam.howlett@...cle.com>, <liulu.liu@...or.com>,
	<feng.han@...or.com>, <zhongjinji@...or.com>
Subject: [PATCH v4 0/3] mm/oom_kill: Only delay OOM reaper for processes using robust futexes

From: zhongjinji <zhongjinji@...or.com>

The OOM reaper quickly reclaims a process's memory when the system hits OOM,
helping the system recover. Without the OOM reaper, if a process frozen by
cgroup v1 is OOM killed, the victim's memory cannot be freed, leaving the
system in a poor state. Even if the process is not frozen by cgroup v1,
reaping the victim's memory remains important, because the OOM reaper works
in parallel with the exiting victim and so speeds up memory release.

When processes holding robust futexes are OOM killed but waiters on those
futexes remain alive, the memory holding the robust futexes might be reaped
before futex_cleanup() runs. This can cause the waiters to block
indefinitely [1].

To prevent this issue, the OOM reaper's work is delayed by 2 seconds [1]. Since
many killed processes exit within 2 seconds, the OOM reaper rarely runs after
this delay. However, robust futex users are few, so delaying OOM reap for all
victims is unnecessary.

If each thread's robust_list in a process is NULL, the process holds no robust
futexes. For such processes, the OOM reaper should not be delayed. For
processes holding robust futexes, to avoid issue [1], the OOM reaper must
still be delayed.
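
The rule above can be sketched as a small user-space model. This is only an
illustration, not the code from patches 1 and 2: struct model_thread,
struct model_process and the model_* functions are hypothetical stand-ins, and
the kernel's task iteration and locking are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not kernel structures. */
struct model_thread {
	const void *robust_list;	/* NULL: thread registered no robust list */
};

struct model_process {
	const struct model_thread *threads;
	size_t nr_threads;
};

/* The 2-second delay described in the cover letter, in milliseconds. */
#define MODEL_REAP_DELAY_MS 2000u

/* Return true if any thread of the process has a non-NULL robust_list. */
static bool model_process_has_robust_futex(const struct model_process *p)
{
	assert(p != NULL);

	for (size_t i = 0; i < p->nr_threads; i++) {
		if (p->threads[i].robust_list != NULL)
			return true;
	}
	return false;
}

/* Delay the reaper only when the victim may hold robust futexes. */
static unsigned int model_oom_reap_delay_ms(const struct model_process *p)
{
	return model_process_has_robust_futex(p) ? MODEL_REAP_DELAY_MS : 0u;
}
```

In this model a process whose threads all have a NULL robust_list gets a delay
of 0, while a single thread with a registered list is enough to keep the
2-second delay.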

Patch 1 introduces process_has_robust_futex() to detect whether a process uses
robust futexes. Patch 2 delays the OOM reaper only for processes holding robust
futexes, improving OOM reaper performance. Patch 3 makes the OOM reaper and
exit_mmap() traverse the maple tree in opposite orders to reduce PTE lock
contention caused by unmapping the same vma.
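
The idea behind patch 3 can be illustrated with a toy model. It assumes,
purely for illustration, that exit_mmap() and the OOM reaper each handle one
VMA per round in lock step, which the real kernel does not guarantee;
model_same_vma_rounds() is a hypothetical helper, not part of the series.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the rounds in which both walkers touch the same VMA index.
 * The exit path always walks forward (0, 1, 2, ...); the reaper walks
 * forward as well, or backward when reaper_reversed is non-zero.
 */
static size_t model_same_vma_rounds(size_t nr_vmas, int reaper_reversed)
{
	size_t collisions = 0;

	for (size_t step = 0; step < nr_vmas; step++) {
		size_t exit_idx = step;
		size_t reaper_idx = reaper_reversed ? nr_vmas - 1 - step : step;

		if (exit_idx == reaper_idx)
			collisions++;
	}
	return collisions;
}
```

Under these assumptions, two forward walkers meet on every VMA, whereas
opposite directions meet at most once, in the middle of the address space.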

Link: https://lore.kernel.org/all/20220414144042.677008-1-npache@redhat.com/T/#u [1]

---

v3 -> v4:

1. Rename check_robust_futex() to process_has_robust_futex() for clearer
   intent.
2. Because the delay_reap parameter was added to task_will_free_mem(),
   the function is renamed to should_reap_task() to better clarify
   its purpose.
3. Add should_delay_oom_reap() to decide whether to delay OOM reap.
4. Modify the OOM reaper to traverse the maple tree in reverse order; see patch
   3 for details.

These changes improve code readability and enhance OOM reaper behavior.

zhongjinji (3):
  futex: Introduce function process_has_robust_futex()
  mm/oom_kill: Only delay OOM reaper for processes using robust futexes
  mm/oom_kill: Have the OOM reaper and exit_mmap() traverse the maple
    tree in opposite orders

 include/linux/futex.h |  5 ++++
 include/linux/mm.h    |  3 +++
 kernel/futex/core.c   | 30 +++++++++++++++++++++++
 mm/oom_kill.c         | 55 +++++++++++++++++++++++++++++++------------
 4 files changed, 78 insertions(+), 15 deletions(-)

-- 
2.17.1

