Message-ID: <76a69a50-eeac-4cc4-8e43-0311280ece3a@huaweicloud.com>
Date: Wed, 17 Dec 2025 08:42:25 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Waiman Long <llong@...hat.com>, tj@...nel.org, hannes@...xchg.org,
 mkoutny@...e.com
Cc: cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
 lujialin4@...wei.com, chenridong@...wei.com
Subject: Re: [PATCH -next] cpuset: fix warning when disabling remote partition



On 2025/12/17 6:03, Waiman Long wrote:
> On 12/8/25 9:56 PM, Chen Ridong wrote:
>> On 2025/11/27 11:04, Chen Ridong wrote:
>>> From: Chen Ridong <chenridong@...wei.com>
>>>
>>> A warning was triggered as follows:
>>>
>>> WARNING: kernel/cgroup/cpuset.c:1651 at remote_partition_disable+0xf7/0x110
>>> RIP: 0010:remote_partition_disable+0xf7/0x110
>>> RSP: 0018:ffffc90001947d88 EFLAGS: 00000206
>>> RAX: 0000000000007fff RBX: ffff888103b6e000 RCX: 0000000000006f40
>>> RDX: 0000000000006f00 RSI: ffffc90001947da8 RDI: ffff888103b6e000
>>> RBP: ffff888103b6e000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000001 R11: ffff88810b2e2728 R12: ffffc90001947da8
>>> R13: 0000000000000000 R14: ffffc90001947da8 R15: ffff8881081f1c00
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00007f55c8bbe0b2 CR3: 000000010b14c000 CR4: 00000000000006f0
>>> Call Trace:
>>>   <TASK>
>>>   update_prstate+0x2d3/0x580
>>>   cpuset_partition_write+0x94/0xf0
>>>   kernfs_fop_write_iter+0x147/0x200
>>>   vfs_write+0x35d/0x500
>>>   ksys_write+0x66/0xe0
>>>   do_syscall_64+0x6b/0x390
>>>   entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>> RIP: 0033:0x7f55c8cd4887
> 
> Sorry for the late reply. I was at the Linux Plumbers Conference last week and so didn't have time to
> fully review it.
> 
>>>
>>> Reproduction steps (on a 16-CPU machine):
>>>
>>>          # cd /sys/fs/cgroup/
>>>          # mkdir A1
>>>          # echo +cpuset > A1/cgroup.subtree_control
>>>          # echo "0-14" > A1/cpuset.cpus.exclusive
>>>          # mkdir A1/A2
>>>          # echo "0-14" > A1/A2/cpuset.cpus.exclusive
>>>          # echo "root" > A1/A2/cpuset.cpus.partition
>>>          # echo 0 > /sys/devices/system/cpu/cpu15/online
>>>          # echo member > A1/A2/cpuset.cpus.partition
>>>
>>> When CPU 15 is offlined, subpartitions_cpus gets cleared because no CPUs
>>> remain available for the top_cpuset, forcing partitions to share CPUs with
>>> the top_cpuset. In this scenario, disabling the remote partition triggers
>>> a warning stating that effective_xcpus is not a subset of
>>> subpartitions_cpus. Partitions should be invalidated in this case to
>>> inform users that the partition is now invalid (CPUs are shared with
>>> top_cpuset).
> 
> This is a real corner case, as such a scenario should rarely happen in a real production environment.
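
For readers following the thread, here is a minimal userspace sketch of why the pre-patch check fires in this scenario. It is not kernel code: mask_t, mask_subset(), cpu_range() and old_check_warns() are hypothetical stand-ins for the cpumask helpers and the WARN_ON_ONCE() condition, and CPUs are modelled as bits in a 32-bit word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (assumption): one bit per CPU, at most 32 CPUs. */
typedef uint32_t mask_t;

/* True if every CPU in a is also in b; stands in for cpumask_subset(). */
static bool mask_subset(mask_t a, mask_t b)
{
    return (a & ~b) == 0;
}

/* Build a mask for CPUs lo..hi inclusive, e.g. cpu_range(0, 14) for "0-14". */
static mask_t cpu_range(unsigned int lo, unsigned int hi)
{
    mask_t m = 0;

    assert(lo <= hi && hi < 32);
    for (unsigned int cpu = lo; cpu <= hi; cpu++)
        m |= (mask_t)1 << cpu;
    return m;
}

/* The condition as it stood before the patch: warn on any non-subset. */
static bool old_check_warns(mask_t effective_xcpus, mask_t subpartitions_cpus)
{
    return !mask_subset(effective_xcpus, subpartitions_cpus);
}
```

With effective_xcpus set to cpu_range(0, 14), the old condition stays quiet while subpartitions_cpus is also 0-14, but it fires once subpartitions_cpus has been cleared to 0, because a non-empty mask is never a subset of an empty one.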
> 
> 
>>>
>>> To fix this issue:
>>> 1. Emit the warning only if subpartitions_cpus is not empty and
>>>     effective_xcpus is not a subset of subpartitions_cpus.
>>> 2. During the CPU hotplug process, invalidate partitions if
>>>     subpartitions_cpus is empty.
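
The two steps above can be sketched as plain boolean predicates. This is a userspace model under stated assumptions, not the kernel implementation: mask_t and every function name below are hypothetical, and the boolean parameters stand in for the results of is_partition_valid(), tasks_nocpu_error() and partition_is_populated() in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (assumption): one bit per CPU, at most 32 CPUs. */
typedef uint32_t mask_t;

static bool mask_empty(mask_t m) { return m == 0; }
static bool mask_subset(mask_t a, mask_t b) { return (a & ~b) == 0; }

/* Step 1: warn only if subpartitions_cpus is non-empty and still
 * does not cover effective_xcpus. */
static bool new_check_warns(mask_t effective_xcpus, mask_t subpartitions_cpus)
{
    return !mask_subset(effective_xcpus, subpartitions_cpus) &&
           !mask_empty(subpartitions_cpus);
}

/* Step 2, remote partition: disable it during hotplug if
 * subpartitions_cpus is empty, or if it has tasks but no CPUs left. */
static bool remote_should_disable(bool remote, mask_t subpartitions_cpus,
                                  mask_t new_cpus, bool populated)
{
    return remote && (mask_empty(subpartitions_cpus) ||
                      (mask_empty(new_cpus) && populated));
}

/* Step 2, local partition: an empty subpartitions_cpus is a third
 * reason to invalidate, next to the two existing ones. */
static bool local_should_invalidate(bool local, bool parent_valid,
                                    bool tasks_nocpu, mask_t subpartitions_cpus)
{
    return local && (!parent_valid || tasks_nocpu ||
                     mask_empty(subpartitions_cpus));
}

/* Walk through the reproduction case: partition on CPUs 0-14 (0x7fff),
 * subpartitions_cpus cleared after CPU 15 goes offline. */
static void self_check(void)
{
    assert(!new_check_warns(0x7fff, 0));
    assert(remote_should_disable(true, 0, 0x7fff, false));
    assert(local_should_invalidate(true, true, false, 0));
}
```

In this model the reproduction case no longer warns, and both remote and local partitions are flagged for invalidation as soon as subpartitions_cpus is empty, regardless of the other inputs.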
>>>
>>> Fixes: 4449b1ce46bf ("cgroup/cpuset: Remove remote_partition_check() & make update_cpumasks_hier() handle remote partition")
>>> Signed-off-by: Chen Ridong <chenridong@...wei.com>
>>> ---
>>>   kernel/cgroup/cpuset.c | 21 ++++++++++++++++-----
>>>   1 file changed, 16 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>>> index fea577b4016a..fbe539d66d9b 100644
>>> --- a/kernel/cgroup/cpuset.c
>>> +++ b/kernel/cgroup/cpuset.c
>>> @@ -1648,7 +1648,14 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
>>>   static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
>>>   {
>>>       WARN_ON_ONCE(!is_remote_partition(cs));
>>> -    WARN_ON_ONCE(!cpumask_subset(cs->effective_xcpus, subpartitions_cpus));
>>> +    /*
>>> +     * When a CPU is offlined, top_cpuset may end up with no available CPUs,
>>> +     * which should clear subpartitions_cpus. We should not emit a warning for this
>>> +     * scenario: the hierarchy is updated from top to bottom, so subpartitions_cpus
>>> +     * may already be cleared when disabling the partition.
>>> +     */
>>> +    WARN_ON_ONCE(!cpumask_subset(cs->effective_xcpus, subpartitions_cpus) &&
>>> +             !cpumask_empty(subpartitions_cpus));
>>>
>>>       spin_lock_irq(&callback_lock);
>>>       cs->remote_partition = false;
>>> @@ -3956,8 +3963,9 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
>>>       if (remote || (is_partition_valid(cs) && is_partition_valid(parent)))
>>>           compute_partition_effective_cpumask(cs, &new_cpus);
>>>
>>> -    if (remote && cpumask_empty(&new_cpus) &&
>>> -        partition_is_populated(cs, NULL)) {
>>> +    if (remote && (cpumask_empty(subpartitions_cpus) ||
>>> +            (cpumask_empty(&new_cpus) &&
>>> +             partition_is_populated(cs, NULL)))) {
>>>           cs->prs_err = PERR_HOTPLUG;
>>>           remote_partition_disable(cs, tmp);
>>>           compute_effective_cpumask(&new_cpus, cs, parent);
>>> @@ -3970,9 +3978,12 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
>>>        * 1) empty effective cpus but not valid empty partition.
>>>        * 2) parent is invalid or doesn't grant any cpus to child
>>>        *    partitions.
>>> +     * 3) subpartitions_cpus is empty.
>>>        */
>>> -    if (is_local_partition(cs) && (!is_partition_valid(parent) ||
>>> -                tasks_nocpu_error(parent, cs, &new_cpus)))
>>> +    if (is_local_partition(cs) &&
>>> +        (!is_partition_valid(parent) ||
>>> +         tasks_nocpu_error(parent, cs, &new_cpus) ||
>>> +         cpumask_empty(subpartitions_cpus)))
>>>           partcmd = partcmd_invalidate;
>>>       /*
>>>        * On the other hand, an invalid partition root may be transitioned
>> Friendly ping.
> Reviewed-by: Waiman Long <longman@...hat.com>
> 

Thanks.

-- 
Best regards,
Ridong

