[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8d38b4ea-6b55-4319-b330-2dc33ac5cd77@huaweicloud.com>
Date: Thu, 20 Nov 2025 21:45:16 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Sun Shaojie <sunshaojie@...inos.cn>
Cc: cgroups@...r.kernel.org, hannes@...xchg.org,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
llong@...hat.com, mkoutny@...e.com, shuah@...nel.org, tj@...nel.org
Subject: Re: [PATCH v5] cpuset: Avoid invalidating sibling partitions on
cpuset.cpus conflict.
On 2025/11/20 21:07, Sun Shaojie wrote:
> Hi, Ridong,
>
> On Thu, 20 Nov 2025 08:51:30, Chen Ridong wrote:
>> On 2025/11/19 18:57, Sun Shaojie wrote:
>>> kernel/cgroup/cpuset.c | 19 +------------------
>>> .../selftests/cgroup/test_cpuset_prs.sh | 7 ++++---
>>> 2 files changed, 5 insertions(+), 21 deletions(-)
>>>
>>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>>> index 52468d2c178a..f6a834335ebf 100644
>>> --- a/kernel/cgroup/cpuset.c
>>> +++ b/kernel/cgroup/cpuset.c
>>> @@ -2411,34 +2411,17 @@ static int cpus_allowed_validate_change(struct cpuset *cs, struct cpuset *trialc
>>> struct tmpmasks *tmp)
>>> {
>>> int retval;
>>> - struct cpuset *parent = parent_cs(cs);
>>>
>>> retval = validate_change(cs, trialcs);
>>>
>>> if ((retval == -EINVAL) && cpuset_v2()) {
>>> - struct cgroup_subsys_state *css;
>>> - struct cpuset *cp;
>>> -
>>> /*
>>> * The -EINVAL error code indicates that partition sibling
>>> * CPU exclusivity rule has been violated. We still allow
>>> * the cpumask change to proceed while invalidating the
>>> - * partition. However, any conflicting sibling partitions
>>> - * have to be marked as invalid too.
>>> + * partition.
>>> */
>>> trialcs->prs_err = PERR_NOTEXCL;
>>> - rcu_read_lock();
>>> - cpuset_for_each_child(cp, css, parent) {
>>> - struct cpumask *xcpus = user_xcpus(trialcs);
>>> -
>>> - if (is_partition_valid(cp) &&
>>> - cpumask_intersects(xcpus, cp->effective_xcpus)) {
>>> - rcu_read_unlock();
>>> - update_parent_effective_cpumask(cp, partcmd_invalidate, NULL, tmp);
>>> - rcu_read_lock();
>>> - }
>>> - }
>>> - rcu_read_unlock();
>>> retval = 0;
>>> }
>>> return retval;
>>
>> If we remove this logic, there is a scenario where the parent (a partition) could end up with empty
>> effective CPUs. This means the corresponding CS will also have empty effective CPUs and thus fail to
>> disable its siblings' partitions.
>
> I have carefully considered the scenario where parent effective CPUs are
> empty, which corresponds to the following two cases. (After apply this patch).
>
> root cgroup
> |
> A1
> / \
> A2 A3
>
> Case 1:
> Step:
> #1> echo "0-1" > A1/cpuset.cpus
> #2> echo "root" > A1/cpuset.cpus.partition
> #3> echo "0-1" > A2/cpuset.cpus
> #4> echo "root" > A2/cpuset.cpus.partition
>
> After step #4,
>
> | A1 | A2 | A3 |
> cpus_allowed | 0-1 | 0-1 | |
> effective_cpus | | 0-1 | |
> prstate | root | root | member |
>
> After step #4, A3's effective CPUs is empty.
>
That may be a corner case is unexpected.
> #5> echo "0-1" > A3/cpuset.cpus
>
If we create subdirectories (e.g., A4, A5, ...) under the A1 cpuset and then configure cpuset.cpus
for A1 (a common usage scenario), processes can no longer be migrated into these subdirectories (A4,
A5, ...) afterward. However, prior to your patch, this migration was allowed.
> After step #5,
>
> | A1 | A2 | A3 |
> cpus_allowed | 0-1 | 0-1 | 0-1 |
> effective_cpus | | 0-1 | |
> prstate | root | root | member |
>
> This patch affects step #5. After step #5, A3's effective CPUs is also empty.
> Since A3's effective CPUs can be empty before step #5 (setting cpuset.cpus),
> it is acceptable for them to remain empty after step #5. Moreover, if A3 is
> aware that its parent's effective CPUs are empty, it should understand that
> the CPUs it requests may not be granted.
>
> Case 2:
> Step:
> #1> echo "0-1" > A1/cpuset.cpus
> #2> echo "root" > A1/cpuset.cpus.partition
> #3> echo "0" > A2/cpuset.cpus
> #4> echo "root" > A2/cpuset.cpus.partition
> #5> echo "1" > A3/cpuset.cpus
> #6> echo "root" > A3/cpuset.cpus.partition
>
> After step #6,
>
> | A1 | A2 | A3 |
> cpus_allowed | 0-1 | 0 | 1 |
> effective_cpus | | 0 | 1 |
> prstate | root | root | root |
>
> #7> echo "0-1" > A3/cpuset.cpus
>
> After step #7,
>
> | A1 | A2 | A3 |
> cpus_allowed | 0-1 | 0 | 0-1 |
> effective_cpus | 1 | 0 | 1 |
> prstate | root | root | root invalid |
>
> This patch affects step #7. After step #7, A3 only affects itself, changing
> from "root" to "root invalid". However, since its effective CPUs remain 1
> both before and after step #7, it doesn't matter even if A2 is not invalidated.
>
> The purpose of this patch is to ensure that modifying cpuset.cpus does not
> disable its siblings' partitions.
>
>
> Thanks,
> Sun Shaojie
--
Best regards,
Ridong
Powered by blists - more mailing lists