Message-Id: <20251119105749.1385946-1-sunshaojie@kylinos.cn>
Date: Wed, 19 Nov 2025 18:57:49 +0800
From: Sun Shaojie <sunshaojie@...inos.cn>
To: llong@...hat.com,
	chenridong@...weicloud.com,
	mkoutny@...e.com
Cc: cgroups@...r.kernel.org,
	hannes@...xchg.org,
	linux-kernel@...r.kernel.org,
	linux-kselftest@...r.kernel.org,
	shuah@...nel.org,
	tj@...nel.org,
	Sun Shaojie <sunshaojie@...inos.cn>
Subject: [PATCH v5] cpuset: Avoid invalidating sibling partitions on cpuset.cpus conflict.

Currently, writing a value to a cpuset's cpuset.cpus that conflicts with
a sibling partition makes the sibling's partition state invalid. This
invalidation is often unnecessary. If the cpuset being modified is
exclusive, it should invalidate itself on conflict rather than its sibling.

This patch changes behavior only in the following two cases. Both assume
a machine with 4 CPUs (0-3) and the following hierarchy:

   root cgroup
      /    \
    A1      B1

Case 1: A1 is exclusive, B1 is non-exclusive, set B1's cpuset.cpus

 Table 1.1: Before applying this patch
 Step                                       | A1's prstate | B1's prstate |
 #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
 #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
 #3> echo "0" > B1/cpuset.cpus              | root invalid | member       |

After step #3, A1 changes from "root" to "root invalid" because its CPUs
(0-1) overlap with those requested by B1 (0). However, B1 can still use
CPUs 2-3 (from B1's parent), so it is more reasonable for A1 to remain
"root".
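
The overlap described above can be checked in userspace with a small
POSIX shell sketch. This is only an illustration and not the kernel's
cpumask logic; expand_cpus and common_cpus are hypothetical helper names
introduced here, and only comma-separated lists of CPUs and ranges such
as "0-1,3" are assumed.

```shell
# expand_cpus LIST: print one CPU number per line for a list such as "0-1,3".
expand_cpus() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        [ -n "$lo" ] || continue
        i=$lo
        # A single CPU such as "3" has no upper bound, so reuse the lower one.
        while [ "$i" -le "${hi:-$lo}" ]; do
            echo "$i"
            i=$((i + 1))
        done
    done
}

# common_cpus A B: print the CPUs that appear in both lists, space separated.
# An empty line means the two lists do not conflict.
common_cpus() {
    out=
    for c in $(expand_cpus "$1"); do
        for d in $(expand_cpus "$2"); do
            if [ "$c" -eq "$d" ]; then
                out="${out:+$out }$c"
            fi
        done
    done
    printf '%s\n' "$out"
}
```

For Case 1, common_cpus "0-1" "0" reports CPU 0 as the conflict between
A1 and the value B1 requests, while common_cpus "0-1" "2-3" reports
nothing, which is why B1 can fall back to CPUs 2-3.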

 Table 1.2: After applying this patch
 Step                                       | A1's prstate | B1's prstate |
 #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
 #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
 #3> echo "0" > B1/cpuset.cpus              | root         | member       |

Case 2: Both A1 and B1 are exclusive, set B1's cpuset.cpus

 Table 2.1: Before applying this patch
 Step                                       | A1's prstate | B1's prstate |
 #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
 #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
 #3> echo "2" > B1/cpuset.cpus              | root         | member       |
 #4> echo "root" > B1/cpuset.cpus.partition | root         | root         |
 #5> echo "1-2" > B1/cpuset.cpus            | root invalid | root invalid |

After step #4, B1 can exclusively use CPU 2. Therefore, at step #5,
regardless of what conflicting value B1 writes to cpuset.cpus, it will
always have at least CPU 2 available. This makes it unnecessary to mark
A1 as "root invalid".

 Table 2.2: After applying this patch
 Step                                       | A1's prstate | B1's prstate |
 #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
 #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
 #3> echo "2" > B1/cpuset.cpus              | root         | member       |
 #4> echo "root" > B1/cpuset.cpus.partition | root         | root         |
 #5> echo "1-2" > B1/cpuset.cpus            | root         | root invalid |
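
The steps of Case 2 can be replayed with a sketch like the one below. It
assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup, root privileges,
and the cpuset controller already enabled for children of the root
cgroup; replay_case2 is a hypothetical helper name, and the mount point
can be overridden by passing a directory as the first argument.

```shell
# replay_case2 [CGROOT]: write the five values from Case 2 in order and
# print the partition state read back for A1 and B1 after each step.
replay_case2() {
    cgroot=${1:-/sys/fs/cgroup}   # assumed cgroup v2 mount point
    mkdir -p "$cgroot/A1" "$cgroot/B1" || return 1

    step=0
    # Each input line is "<file relative to CGROOT> <value>", steps #1 to #5.
    while read -r file value; do
        step=$((step + 1))
        echo "$value" > "$cgroot/$file" || return 1
        printf '#%d A1=%s B1=%s\n' "$step" \
            "$(cat "$cgroot/A1/cpuset.cpus.partition" 2>/dev/null)" \
            "$(cat "$cgroot/B1/cpuset.cpus.partition" 2>/dev/null)"
    done <<'EOF'
A1/cpuset.cpus 0-1
A1/cpuset.cpus.partition root
B1/cpuset.cpus 2
B1/cpuset.cpus.partition root
B1/cpuset.cpus 1-2
EOF
}
```

Calling replay_case2 with no argument on a patched kernel should show, on
the line for step #5, A1 still reported as root and B1 reported as root
invalid, matching Table 2.2. On an unpatched kernel both are expected to
be reported as invalid, matching Table 2.1.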

In summary, regardless of how B1 configures its cpuset.cpus, there will
always be available CPUs in B1's cpuset.cpus.effective. Therefore, there
is no need to change A1 from "root" to "root invalid".

All other cases, for example cgroup v1, remain unaffected.

Signed-off-by: Sun Shaojie <sunshaojie@...inos.cn>
---
 kernel/cgroup/cpuset.c                        | 19 +------------------
 .../selftests/cgroup/test_cpuset_prs.sh       |  7 ++++---
 2 files changed, 5 insertions(+), 21 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 52468d2c178a..f6a834335ebf 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2411,34 +2411,17 @@ static int cpus_allowed_validate_change(struct cpuset *cs, struct cpuset *trialc
 					struct tmpmasks *tmp)
 {
 	int retval;
-	struct cpuset *parent = parent_cs(cs);
 
 	retval = validate_change(cs, trialcs);
 
 	if ((retval == -EINVAL) && cpuset_v2()) {
-		struct cgroup_subsys_state *css;
-		struct cpuset *cp;
-
 		/*
 		 * The -EINVAL error code indicates that partition sibling
 		 * CPU exclusivity rule has been violated. We still allow
 		 * the cpumask change to proceed while invalidating the
-		 * partition. However, any conflicting sibling partitions
-		 * have to be marked as invalid too.
+		 * partition.
 		 */
 		trialcs->prs_err = PERR_NOTEXCL;
-		rcu_read_lock();
-		cpuset_for_each_child(cp, css, parent) {
-			struct cpumask *xcpus = user_xcpus(trialcs);
-
-			if (is_partition_valid(cp) &&
-			    cpumask_intersects(xcpus, cp->effective_xcpus)) {
-				rcu_read_unlock();
-				update_parent_effective_cpumask(cp, partcmd_invalidate, NULL, tmp);
-				rcu_read_lock();
-			}
-		}
-		rcu_read_unlock();
 		retval = 0;
 	}
 	return retval;
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index a17256d9f88a..7d8941f65d84 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -388,10 +388,11 @@ TEST_MATRIX=(
 	"  C0-1:S+  C1      .    C2-3     .      P2     .      .     0 A1:0-1|A2:1 A1:P0|A2:P-2"
 	"  C0-1:S+ C1:P2    .    C2-3     P1     .      .      .     0 A1:0|A2:1 A1:P1|A2:P2 0-1|1"
 
-	# A non-exclusive cpuset.cpus change will invalidate partition and its siblings
+	# A non-exclusive cpuset.cpus change will not invalidate its sibling partitions.
+	# An exclusive cpuset.cpus change will invalidate itself.
 	"  C0-1:P1   .      .    C2-3   C0-2     .      .      .     0 A1:0-2|B1:2-3 A1:P-1|B1:P0"
-	"  C0-1:P1   .      .  P1:C2-3  C0-2     .      .      .     0 A1:0-2|B1:2-3 A1:P-1|B1:P-1"
-	"   C0-1     .      .  P1:C2-3  C0-2     .      .      .     0 A1:0-2|B1:2-3 A1:P0|B1:P-1"
+	"  C0-1:P1   .      .  P1:C2-3  C0-2     .      .      .     0 A1:0-1|B1:2-3 A1:P-1|B1:P1"
+	"   C0-1     .      .  P1:C2-3  C0-2     .      .      .     0 A1:0-1|B1:2-3 A1:P0|B1:P1"
 
 	# cpuset.cpus can overlap with sibling cpuset.cpus.exclusive but not subsumed by it
 	"   C0-3     .      .    C4-5     X5     .      .      .     0 A1:0-3|B1:4-5"
-- 
2.25.1

