lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <86361412-7de0-46bc-9188-a32b634e43a3@huaweicloud.com>
Date: Wed, 12 Nov 2025 09:58:01 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Waiman Long <llong@...hat.com>, Michal Koutný
 <mkoutny@...e.com>
Cc: tj@...nel.org, hannes@...xchg.org, cgroups@...r.kernel.org,
 linux-kernel@...r.kernel.org, lujialin4@...wei.com, chenridong@...wei.com
Subject: Re: [PATCH next] cpuset: Treat tasks in attaching process as
 populated



On 2025/11/12 4:35, Waiman Long wrote:
> On 11/11/25 2:25 PM, Michal Koutný wrote:
>> On Tue, Nov 11, 2025 at 10:16:33AM -0500, Waiman Long <llong@...hat.com> wrote:
>>> For internal helper like this one, we may not really need that as
>>> almost all the code in cpuset.c are within either a cpuset_mutex or
>>> callback_lock critical sections. So I am fine with or without it.
>> OK, cpuset_mutex and callback_lock are close but cgroup_is_populated()
>> that caught my eye would also need cgroup_mutex otherwise "the result
>> can only be used as a hint" (quote from cgroup.h).
>>
>> Or is it safe to assume that cpuset_mutex inside cpuset_attach() is
>> sufficient to always (incl. exits) ensure stability of
>> cgroup_is_populated() result?
>>
>> Anyway, I'd find some clarifications in the commit message or the
>> surrounding code about this helpful. (Judgment call, whether with a
>> lockdep macro. My opinion is -- why not.)
> 
> For attach_in_progress, it is protected by the cpuset_mutex. So it may make sense to add a
> lockdep_assert_held() for that.
> 

Will add.

> You are right that there are problems WRT the stability of cgroup_is_populated() value.
> 
> I think "cgrp->nr_populated_csets + cs->attach_in_progress" should be almost stable for the cgroup
> itself with cpuset_mutex, but there can be a small timing window after cpuset_attach(), but before
> the stat is updated where the sum is 0, but there are actually tasks in the cgroup.
> 
Do you mean there’s a small window after ss->attach (i.e., cpuset_attach) where
cgrp->nr_populated_csets + cs->attach_in_progress could be 0?

If I understand correctly:

ss->can_attach: cs->attach_in_progress++, sum > 0
css_set_move_task->css_set_update_populated: cgrp->nr_populated_csets++, sum > 0
ss->attach->cpuset_attach: cs->attach_in_progress--, sum > 0

What exactly is the small window you’re referring to where the sum equals 0?

> For "cgrp->nr_populated_domain_children + cgrp->nr_populated_threaded_children", it also has the
> problem that the sum can be 0 but there are attach_in_progress set in one or more of the child
> cgroups. So even with this patch, we can't guarantee 100% that there can be no task in the partition
> even if it has empty effective_cpus. It is only a problem for nested local partitions though as
> remote partitions are not allowed to exhaust all the CPUs from root cgroup.
> 
> We should probably document that limitation to warn users if they try to create nested local
> partitions where the parent partition root of the child partitions has empty effective_cpus.
> 

Hmm, but it was what the commit e2d59900d936 ("cgroup/cpuset: Allow no-task partition to have empty
cpuset.cpus.effective") allowed, and it makes sense.


-- 
Best regards,
Ridong


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ