Message-ID: <9a2e6d5b-9b42-4f32-a8d2-552c2585cf0f@huaweicloud.com>
Date: Sat, 27 Dec 2025 18:10:40 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Waiman Long <longman@...hat.com>, Tejun Heo <tj@...nel.org>,
Johannes Weiner <hannes@...xchg.org>, Michal Koutný
<mkoutny@...e.com>, Shuah Khan <shuah@...nel.org>
Cc: linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
linux-kselftest@...r.kernel.org, Sun Shaojie <sunshaojie@...inos.cn>
Subject: Re: [cgroup/for-6.20 PATCH 2/4] cgroup/cpuset: Consistently compute
effective_xcpus in update_cpumasks_hier()
On 2025/12/25 15:30, Waiman Long wrote:
> Since commit f62a5d39368e ("cgroup/cpuset: Remove remote_partition_check()
> & make update_cpumasks_hier() handle remote partition"), the
> compute_effective_exclusive_cpumask() helper was extended to
> strip exclusive CPUs from siblings when computing effective_xcpus
> (cpuset.cpus.exclusive.effective). This helper was later renamed to
> compute_excpus() in commit 86bbbd1f33ab ("cpuset: Refactor exclusive
> CPU mask computation logic").
>
> This helper is supposed to be used consistently to compute
> effective_xcpus. However, there is an exception within the callback
> critical section in update_cpumasks_hier() when exclusive_cpus of a
> valid partition root is empty. This can cause effective_xcpus value to
> differ depending on where exactly it is last computed. Fix this by using
> compute_excpus() in this case to give a consistent result.
>
> Signed-off-by: Waiman Long <longman@...hat.com>
> ---
> kernel/cgroup/cpuset.c | 15 +++++----------
> 1 file changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 3d2d28f0fd03..850334dbc36a 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -2050,6 +2050,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
> struct cpuset *parent = parent_cs(cp);
> bool remote = is_remote_partition(cp);
> bool update_parent = false;
> + bool empty_xcpus;
>
> old_prs = new_prs = cp->partition_root_state;
>
> @@ -2160,20 +2161,14 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
> new_prs = cp->partition_root_state;
> }
>
> + empty_xcpus = cpumask_empty(cp->exclusive_cpus);
> spin_lock_irq(&callback_lock);
> cpumask_copy(cp->effective_cpus, tmp->new_cpus);
> cp->partition_root_state = new_prs;
> - if (!cpumask_empty(cp->exclusive_cpus) && (cp != cs))
> + if (((new_prs > 0) && empty_xcpus) ||
> + ((cp != cs) && !empty_xcpus))
> compute_excpus(cp, cp->effective_xcpus);
The current logic for determining when to recompute effective_xcpus is difficult to follow.
Can we simplify it as follows?
	if (new_prs > 0)
		compute_excpus(cp, cp->effective_xcpus);
	else
		reset_partition_data(cp);
This would make the intent clearer: if cp is a valid partition, we recompute its effective_xcpus;
otherwise, we reset the partition data.
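To compare the two conditions side by side, here is a rough standalone userspace sketch, not kernel code. It models only the boolean decision of whether compute_excpus() gets called. The helper names patch_recomputes(), proposed_recomputes() and count_mismatches() are made up for illustration, and the set of partition_root_state values (-2..2) is an assumption about the PRS_* constants in cpuset.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Condition used in the patch to decide whether compute_excpus() is called.
 * is_top models (cp == cs); empty_xcpus models
 * cpumask_empty(cp->exclusive_cpus).
 */
static bool patch_recomputes(int new_prs, bool empty_xcpus, bool is_top)
{
	return (new_prs > 0 && empty_xcpus) || (!is_top && !empty_xcpus);
}

/* Condition in the suggested simplification. */
static bool proposed_recomputes(int new_prs)
{
	return new_prs > 0;
}

/*
 * Walk every combination of inputs and count where the two conditions
 * disagree. The prs_values table is an assumed model of the partition
 * root states (invalid < 0, member == 0, valid > 0).
 */
static int count_mismatches(void)
{
	static const int prs_values[] = { -2, -1, 0, 1, 2 };
	int mismatches = 0;

	for (size_t i = 0; i < sizeof(prs_values) / sizeof(prs_values[0]); i++)
		for (int empty = 0; empty <= 1; empty++)
			for (int top = 0; top <= 1; top++)
				if (patch_recomputes(prs_values[i], empty, top) !=
				    proposed_recomputes(prs_values[i]))
					mismatches++;
	return mismatches;
}
```

In this model the two conditions disagree in two situations: a valid partition with non-empty exclusive_cpus where cp == cs (only the simplified form recomputes), and a non-partition or invalid cpuset with non-empty exclusive_cpus where cp != cs (only the patch recomputes). The else branch also calls reset_partition_data() for new_prs == 0, whereas the patch only does so for new_prs < 0. Whether those differences matter in practice would need to be checked against the real callers.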
> -
> - /*
> - * Make sure effective_xcpus is properly set for a valid
> - * partition root.
> - */
> - if ((new_prs > 0) && cpumask_empty(cp->exclusive_cpus))
> - cpumask_and(cp->effective_xcpus,
> - cp->cpus_allowed, parent->effective_xcpus);
> - else if (new_prs < 0)
> + if (new_prs < 0)
> reset_partition_data(cp);
> spin_unlock_irq(&callback_lock);
>
--
Best regards,
Ridong