linux-kernel - Re: [PATCH next] sched,freezer: prevent tasks from escaping being frozen

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ae4bd50f-b3ac-45b8-8c95-f246e0c19641@huaweicloud.com>
Date: Tue, 8 Jul 2025 09:38:07 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Michal Koutný <mkoutny@...e.com>,
 Peter Zijlstra <peterz@...radead.org>, Tejun Heo <tj@...nel.org>
Cc: peterz@...radead.org, rafael@...nel.org, pavel@...nel.org,
 timvp@...gle.com, tj@...nel.org, linux-pm@...r.kernel.org,
 linux-kernel@...r.kernel.org, lujialin4@...wei.com, chenridong@...wei.com
Subject: Re: [PATCH next] sched,freezer: prevent tasks from escaping being
 frozen

On 2025/7/8 0:38, Michal Koutný wrote:
> On Fri, Jul 04, 2025 at 11:02:52AM +0800, Chen Ridong <chenridong@...weicloud.com> wrote:
>> Regarding your question: Did you mean that the task was frozen, received
>> another signal to wake up, but should have remained frozen instead of
>> entering the running state?
> 
> My response was a knee-jerk after seeing the move inside the loop.
> Since the task is in __refrigerator(), it would've been already frozen,
> so the 2nd (and more) checks should be OK and it wouldn't escape
> freezing despite the concurrent reader.
> 
> A task in __refrigerator() shouldn't be woken up by a signal, no? So
> your original conservative fix might have been sufficient afterall.
> (A conservative fix is what I'd strive for here, given it's the legacy
> cgroup freezer.)
> 
> Thanks,
> Michal

Thank you, Michal.

try_to_freeze
  if (likely(!freezing(current))) // This guarantees cgroup is not frozen.
	return false;
  __refrigerator
    freezing(current) // The cgroup can't be frozen unless 'current' is frozen.
		      // With the original logic, 'current' cannot escape freezing.

I agree that the original conservative fix is indeed sufficient

However, I'd like to highlight a behavioral change introduced by commit cff5f49d433f
("cgroup_freezer: cgroup_freezing: Check if not frozen"). Before this change, most callers of
freezing(p) would receive true when the cgroup was frozen, whereas now they receive false.

My concern is that the state where freezing(p) returns true (i.e., the cgroup is freezing but not
yet frozen) should be transient. I'm not entirely sure whether all callers of freezing(p) (except
__thaw_task) expect or handle this state correctly. Do the callers want the intermediate freezing
state? I am not sure...

For example, patch 37fb58a7273726e59f9429c89ade5116083a213d ("cgroup,freezer: fix incomplete
freezing when attaching tasks") appears to address an issue that might also be related to commit
cff5f49d433f. This makes me wonder if other callers of freezing(p) could be affected in ways we
haven't yet identified.

Given this, I'm considering whether we should revert commit cff5f49d433f, provided we can safely
remove the WARN_ON_ONCE(freezing(p)) check in __thaw_task. I'd appreciate your thoughts on this
approach.

Best regards,
Ridong