linux-kernel - Re: [PATCH v1] cpuset: Avoid unnecessary partition invalidation

Message-Id: <20251112094610.386299-1-sunshaojie@kylinos.cn>
Date: Wed, 12 Nov 2025 17:46:10 +0800
From: Sun Shaojie <sunshaojie@...inos.cn>
To: chenridong@...weicloud.com
Cc: longman@...hat.com,
	tj@...nel.org,
	hannes@...xchg.org,
	mkoutny@...e.com,
	shuah@...nel.org,
	cgroups@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v1] cpuset: Avoid unnecessary partition invalidation

Hi Ridong,

Thank you for your response.

From your reply "in case 1, A1 can also be converted to a partition," I 
realize there might be a misunderstanding. The scenario I'm addressing 
involves two sibling cgroups where one is an effective partition root and 
the other is not, and both have empty cpuset.cpus.exclusive. Let me 
explain the intention behind case 1 in detail, which will also illustrate 
why this has negative impacts on our product.

In case 1, after #3 completes, A1 is already a valid partition root - this 
is correct. After #4, B1 is created, and B1 is non-exclusive. After #5, 
A1 changes from "root" to "root invalid". But A1 becoming "root invalid" 
could be unnecessary, because letting A1 remain "root" might be more 
acceptable. Here's the analysis:

As documented in cgroup-v2.rst regarding cpuset.cpus: "The actual list of 
CPUs to be granted, however, is subjected to constraints imposed by its 
parent and can differ from the requested CPUs". This means that although 
we're requesting CPUs 0-3 for B1, we can accept that the actual available 
CPUs in B1 might not be 0-3.

Based on this characteristic, in our product's implementation for case 1, 
before writing to B1's cpuset.cpus in #5, we check the parent's 
cpuset.cpus.effective and see that the CPUs available for B1 don't include 
0-1 (since 0-1 are exclusively used by A1). However, we still want to set 
B1's cpuset.cpus to 0-3 because we hope that when 0-1 become available in 
the future, B1 can use them without affecting the normal operation of other 
cgroups.
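
To make the check above concrete, here is a minimal sketch of the set 
arithmetic involved. It is only an illustration, not our product's code: 
the helper names parse_cpulist() and grantable() are hypothetical, and the 
parent's cpuset.cpus.effective value "2-3" is an assumed example (CPUs 0-1 
being held exclusively by A1).

```python
def parse_cpulist(text):
    """Parse a kernel-style CPU list such as "0-1,4" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def grantable(requested, parent_effective):
    """CPUs a child could actually be granted: request limited by the parent."""
    return parse_cpulist(requested) & parse_cpulist(parent_effective)


# Assumed values: B1 requests 0-3 while the parent's effective CPUs are 2-3.
print(sorted(grantable("0-3", "2-3")))  # prints [2, 3]
```

In this toy model, requesting 0-3 while only 2-3 are available simply 
yields 2-3, which is the outcome we would like B1 to get.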

The reality is that because B1's requested cpuset.cpus (0-3) conflicts with 
A1's exclusive CPUs (0-1) at that moment, it destroys the validity of A1's 
partition root. So why must the current rule sacrifice A1's validity to 
accommodate B1's CPU request? In this situation, B1 can clearly use 2-3 
while A1 exclusively uses 0-1 - they don't need to conflict.

This patch narrows the exclusivity conflict check scope to only between 
partitions. Moreover, user-specified CPUs (including cpuset.cpus and 
cpuset.cpus.exclusive) only have true exclusive meaning within effective 
partitions. So why should the current rule perform exclusivity conflict 
checks between an exclusive partition and a non-exclusive member? This is 
clearly unnecessary.
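
The difference between the two rules can be sketched with a toy model. 
This is not the kernel code and not the patch itself: the Cpuset class and 
both functions are hypothetical, and conflicts_current() is only my 
simplified reading of the behaviour seen in case 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpuset:
    name: str
    cpus: frozenset      # user-requested CPUs (cpuset.cpus)
    is_partition: bool   # True if this cgroup is a partition root


def conflicts_current(a, b):
    """Simplified model of case 1: overlap matters if either side is a partition."""
    return (a.is_partition or b.is_partition) and bool(a.cpus & b.cpus)


def conflicts_proposed(a, b):
    """Narrowed check: overlap matters only between two partitions."""
    return a.is_partition and b.is_partition and bool(a.cpus & b.cpus)


a1 = Cpuset("A1", frozenset({0, 1}), is_partition=True)
b1 = Cpuset("B1", frozenset({0, 1, 2, 3}), is_partition=False)

print(conflicts_current(a1, b1))   # True: A1 would become "root invalid"
print(conflicts_proposed(a1, b1))  # False: A1 stays "root"
```

Under the narrowed check, two overlapping partitions would still conflict; 
only the partition-versus-member case changes.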

Thanks
Sun Shaojie