linux-kernel - Re: [[PATCH v2] 2/2] futex: Only delay OOM reaper for processes using robust futex

Message-ID: <20250805131900.17075-1-zhongjinji@honor.com>
Date: Tue, 5 Aug 2025 21:19:00 +0800
From: zhongjinji <zhongjinji@...or.com>
To: <mhocko@...e.com>
CC: <akpm@...ux-foundation.org>, <andrealmeid@...lia.com>,
	<dave@...olabs.net>, <dvhart@...radead.org>, <feng.han@...or.com>,
	<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>, <liulu.liu@...or.com>,
	<mingo@...hat.com>, <npache@...hat.com>, <peterz@...radead.org>,
	<rientjes@...gle.com>, <shakeel.butt@...ux.dev>, <tglx@...utronix.de>,
	<zhongjinji@...or.com>
Subject: Re: [[PATCH v2] 2/2] futex: Only delay OOM reaper for processes using robust futex

>On Mon 04-08-25 19:50:37, zhongjinji wrote:
>> >On Fri 01-08-25 23:36:49, zhongjinji@...or.com wrote:
>> >> From: zhongjinji <zhongjinji@...or.com>
>> >> 
>> >> After merging the patch
>> >> https://lore.kernel.org/all/20220414144042.677008-1-npache@redhat.com/T/#u
>> >> the OOM reaper runs less frequently because many processes exit within 2 seconds.
>> >> 
>> >> However, when a process is killed, timely handling by the OOM reaper allows
>> >> its memory to be freed faster.
>> >> 
>> >> Since relatively few processes use robust futex, delaying the OOM reaper for
>> >> all processes is undesirable, as many killed processes cannot release memory
>> >> more quickly.
>> >
>> >Could you elaborate more about why this is really needed? OOM should be
>> >a very slow path. Why do you care about this potential improvement in
>> >that situation? In other words what is the usecase?
>> 
>> Well, We are using the cgroup v1 freezer. When a frozen process is
>> killed, it cannot exit immediately and is blocked in __refrigerator until
>> it is thawed. When the process cannot be thawed in time, it will result in 
>> increased system memory pressure.
>
>This is an important information to be part of the changelog! It is also

Sorry, I will add this information to the changelog in the next version.

>important to note why don't you care about processes that have robust
>mutexes. Is this purely a probabilistic improvement because those are
>less common?

Yes. My device runs Android. I added a log message in futex_cleanup that
is printed when a process has a robust list, but I have never seen any
process on Android with robust mutexes.
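
For illustration only, a similar check can be sketched from userspace with
the get_robust_list syscall. This is not the log I added in the kernel; the
helper name has_robust_list_head is made up for this example. Note that some
C libraries may register a robust list head for every thread at startup even
if no robust mutex is ever created, so the result depends on the libc in use.

```c
#define _GNU_SOURCE
#include <linux/futex.h>   /* struct robust_list_head */
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): returns 1 if the thread
 * identified by tid (0 means the calling thread) has a robust list head
 * registered with the kernel, 0 if it has none, and -1 on error.
 */
static int has_robust_list_head(pid_t tid)
{
	struct robust_list_head *head = NULL;
	size_t len = 0;

	/* There is no libc wrapper for this syscall, so call it directly. */
	if (syscall(SYS_get_robust_list, tid, &head, &len) != 0)
		return -1;

	return head != NULL;
}
```

Calling has_robust_list_head(0) reports on the calling thread; querying
another thread requires the appropriate ptrace permission.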

>TBH I find this to be really hackish and justification based on cgroup
>v1 (which is considered legacy) doesn't make it particularly appealing.

It does seem hackish to check the robust_list during the OOM kill, and
the relationship between the robust_list and the OOM killer is hard to
see from this change. However, it is a simple way to decide whether to
delay the OOM reaper.
Would it be better to use a function name like unreap_before_exit or
unreap_before_all_exit instead of check_robust_futex?
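
To make the intended decision concrete, here is a minimal userspace model of
it. It is a sketch, not the kernel code from the patch: struct task_model,
task_has_robust_list, reaper_delay_ms and REAPER_DELAY_MS are all
hypothetical names, and the 2000 ms value is an assumption based on the
2 seconds mentioned in the changelog.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; NOT the kernel's task_struct. */
struct task_model {
	const void *robust_list;  /* non-NULL if a robust list head is registered */
};

/* Assumed delay, taken from the 2 seconds mentioned in the changelog. */
#define REAPER_DELAY_MS 2000U

/* True if the modelled task has registered a robust list head. */
static bool task_has_robust_list(const struct task_model *t)
{
	return t->robust_list != NULL;
}

/*
 * Delay the reaper only for tasks that may hold robust futexes;
 * every other task is eligible for reaping immediately.
 */
static unsigned int reaper_delay_ms(const struct task_model *t)
{
	return task_has_robust_list(t) ? REAPER_DELAY_MS : 0U;
}
```

With this shape the name of the predicate is the only thing the OOM path
sees, which is why the naming question above matters.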