Message-ID: <5e690981-2921-4b9f-9771-8afaa15018c8@huaweicloud.com>
Date: Thu, 20 Nov 2025 08:51:30 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Sun Shaojie <sunshaojie@...inos.cn>, llong@...hat.com, mkoutny@...e.com
Cc: cgroups@...r.kernel.org, hannes@...xchg.org,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
shuah@...nel.org, tj@...nel.org
Subject: Re: [PATCH v5] cpuset: Avoid invalidating sibling partitions on cpuset.cpus conflict.

On 2025/11/19 18:57, Sun Shaojie wrote:
> Currently, when setting a cpuset's cpuset.cpus to a value that conflicts
> with its sibling partition, the sibling's partition state becomes invalid.
> However, this invalidation is often unnecessary. If the cpuset being
> modified is exclusive, it should invalidate itself upon conflict.
>
> This patch applies only to the following two cases:
>
> Assume the machine has 4 CPUs (0-3).
>
>    root cgroup
>      /    \
>     A1    B1
>
> Case 1: A1 is exclusive, B1 is non-exclusive, set B1's cpuset.cpus
>
> Table 1.1: Before applying this patch
> Step                                       | A1's prstate | B1's prstate |
> #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
> #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
> #3> echo "0" > B1/cpuset.cpus              | root invalid | member       |
>
> After step #3, A1 changes from "root" to "root invalid" because its CPUs
> (0-1) overlap with those requested by B1 (0). However, B1 can actually
> use CPUs 2-3 (from B1's parent), so it would be more reasonable for A1 to
> remain as "root."
>
> Table 1.2: After applying this patch
> Step                                       | A1's prstate | B1's prstate |
> #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
> #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
> #3> echo "0" > B1/cpuset.cpus              | root         | member       |
>
> Case 2: Both A1 and B1 are exclusive, set B1's cpuset.cpus
>
> Table 2.1: Before applying this patch
> Step                                       | A1's prstate | B1's prstate |
> #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
> #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
> #3> echo "2" > B1/cpuset.cpus              | root         | member       |
> #4> echo "root" > B1/cpuset.cpus.partition | root         | root         |
> #5> echo "1-2" > B1/cpuset.cpus            | root invalid | root invalid |
>
> After step #4, B1 can exclusively use CPU 2. Therefore, at step #5,
> regardless of what conflicting value B1 writes to cpuset.cpus, it will
> always have at least CPU 2 available. This makes it unnecessary to mark
> A1 as "root invalid".
>
> Table 2.2: After applying this patch
> Step                                       | A1's prstate | B1's prstate |
> #1> echo "0-1" > A1/cpuset.cpus            | member       | member       |
> #2> echo "root" > A1/cpuset.cpus.partition | root         | member       |
> #3> echo "2" > B1/cpuset.cpus              | root         | member       |
> #4> echo "root" > B1/cpuset.cpus.partition | root         | root         |
> #5> echo "1-2" > B1/cpuset.cpus            | root         | root invalid |
>
> In summary, regardless of how B1 configures its cpuset.cpus, there will
> always be available CPUs in B1's cpuset.cpus.effective. Therefore, there
> is no need to change A1 from "root" to "root invalid".
>
> All other cases, for example cgroup v1, remain unaffected.
>
> Signed-off-by: Sun Shaojie <sunshaojie@...inos.cn>
> ---
>  kernel/cgroup/cpuset.c                        | 19 +------------------
>  .../selftests/cgroup/test_cpuset_prs.sh       |  7 ++++---
>  2 files changed, 5 insertions(+), 21 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 52468d2c178a..f6a834335ebf 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -2411,34 +2411,17 @@ static int cpus_allowed_validate_change(struct cpuset *cs, struct cpuset *trialc
>  					struct tmpmasks *tmp)
>  {
>  	int retval;
> -	struct cpuset *parent = parent_cs(cs);
>
>  	retval = validate_change(cs, trialcs);
>
>  	if ((retval == -EINVAL) && cpuset_v2()) {
> -		struct cgroup_subsys_state *css;
> -		struct cpuset *cp;
> -
>  		/*
>  		 * The -EINVAL error code indicates that partition sibling
>  		 * CPU exclusivity rule has been violated. We still allow
>  		 * the cpumask change to proceed while invalidating the
> -		 * partition. However, any conflicting sibling partitions
> -		 * have to be marked as invalid too.
> +		 * partition.
>  		 */
>  		trialcs->prs_err = PERR_NOTEXCL;
> -		rcu_read_lock();
> -		cpuset_for_each_child(cp, css, parent) {
> -			struct cpumask *xcpus = user_xcpus(trialcs);
> -
> -			if (is_partition_valid(cp) &&
> -			    cpumask_intersects(xcpus, cp->effective_xcpus)) {
> -				rcu_read_unlock();
> -				update_parent_effective_cpumask(cp, partcmd_invalidate, NULL, tmp);
> -				rcu_read_lock();
> -			}
> -		}
> -		rcu_read_unlock();
>  		retval = 0;
>  	}
>  	return retval;

If we remove this logic, there is a scenario where the parent (itself a partition) could end up with
empty effective CPUs. The corresponding cpuset would then also have empty effective CPUs and would
fail to disable its siblings' partitions.
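
For anyone who wants to check the behaviour on a particular kernel, the Case 2 sequence from the
patch description can be replayed with a small script. This is only a sketch under a few assumptions
that are not stated in the thread: cgroup v2 is mounted at the directory passed as the first
argument (typically /sys/fs/cgroup), the caller is root, CPUs 0-3 are online, and A1 and B1 are not
already in use. The function names replay_case2 and show_state are made up for illustration.

```shell
#!/bin/sh
# Sketch: replay Case 2 (Tables 2.1 / 2.2) and print the partition states.
# Assumptions: $1 is a cgroup v2 mount point, caller is root, CPUs 0-3 online.

# show_state ROOT LABEL: print A1's and B1's cpuset.cpus.partition contents.
show_state() {
    printf '%s  A1=%s  B1=%s\n' "$2" \
        "$(cat "$1/A1/cpuset.cpus.partition")" \
        "$(cat "$1/B1/cpuset.cpus.partition")"
}

# replay_case2 ROOT: run steps #1-#5 from the patch description under ROOT.
replay_case2() {
    root=$1
    # Make the cpuset controller available to the children of ROOT.
    echo "+cpuset" > "$root/cgroup.subtree_control"
    mkdir -p "$root/A1" "$root/B1"

    echo "0-1"  > "$root/A1/cpuset.cpus";           show_state "$root" "#1"
    echo "root" > "$root/A1/cpuset.cpus.partition"; show_state "$root" "#2"
    echo "2"    > "$root/B1/cpuset.cpus";           show_state "$root" "#3"
    echo "root" > "$root/B1/cpuset.cpus.partition"; show_state "$root" "#4"
    # Step #5 is the conflicting write: CPU 1 already belongs to A1.
    echo "1-2"  > "$root/B1/cpuset.cpus";           show_state "$root" "#5"
}
```

Calling `replay_case2 /sys/fs/cgroup` prints one line per step. Going by the tables in the patch
description, the line for step #5 should report both A1 and B1 as "root invalid" without the patch,
and only B1 with it. The test cgroups can be removed afterwards with rmdir.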
--
Best regards,
Ridong