[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <86361412-7de0-46bc-9188-a32b634e43a3@huaweicloud.com>
Date: Wed, 12 Nov 2025 09:58:01 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Waiman Long <llong@...hat.com>, Michal Koutný
<mkoutny@...e.com>
Cc: tj@...nel.org, hannes@...xchg.org, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, lujialin4@...wei.com, chenridong@...wei.com
Subject: Re: [PATCH next] cpuset: Treat tasks in attaching process as
populated
On 2025/11/12 4:35, Waiman Long wrote:
> On 11/11/25 2:25 PM, Michal Koutný wrote:
>> On Tue, Nov 11, 2025 at 10:16:33AM -0500, Waiman Long <llong@...hat.com> wrote:
>>> For internal helper like this one, we may not really need that as
>>> almost all the code in cpuset.c are within either a cpuset_mutex or
>>> callback_lock critical sections. So I am fine with or without it.
>> OK, cpuset_mutex and callback_lock are close but cgroup_is_populated()
>> that caught my eye would also need cgroup_mutex otherwise "the result
>> can only be used as a hint" (quote from cgroup.h).
>>
>> Or is it safe to assume that cpuset_mutex inside cpuset_attach() is
>> sufficient to always (incl. exits) ensure stability of
>> cgroup_is_populated() result?
>>
>> Anyway, I'd find some clarifications in the commit message or the
>> surrounding code about this helpful. (Judgment call, whether with a
>> lockdep macro. My opinion is -- why not.)
>
> For attach_in_progress, it is protected by the cpuset_mutex. So it may make sense to add a
> lockdep_assert_held() for that.
>
Will add.
> You are right that there are problems WRT the stability of cgroup_is_populated() value.
>
> I think "cgrp->nr_populated_csets + cs->attach_in_progress" should be almost stable for the cgroup
> itself with cpuset_mutex, but there can be a small timing window after cpuset_attach(), but before
> the stat is updated where the sum is 0, but there are actually tasks in the cgroup.
>
Do you mean there’s a small window after ss->attach (i.e., cpuset_attach) where
cgrp->nr_populated_csets + cs->attach_in_progress could be 0?
If I understand correctly:
ss->can_attach: cs->attach_in_progress++, sum > 0
css_set_move_task->css_set_update_populated: cgrp->nr_populated_csets++, sum > 0
ss->attach->cpuset_attach: cs->attach_in_progress--, sum > 0
What exactly is the small window you’re referring to where the sum equals 0?
> For "cgrp->nr_populated_domain_children + cgrp->nr_populated_threaded_children", it also has the
> problem that the sum can be 0 but there are attach_in_progress set in one or more of the child
> cgroups. So even with this patch, we can't guarantee 100% that there can be no task in the partition
> even if it has empty effective_cpus. It is only a problem for nested local partitions though as
> remote partitions are not allowed to exhaust all the CPUs from root cgroup.
>
> We should probably document that limitation to warn users if they try to create nested local
> partitions where the parent partition root of the child partitions has empty effective_cpus.
>
Hmm, but it was what the commit e2d59900d936 ("cgroup/cpuset: Allow no-task partition to have empty
cpuset.cpus.effective") allowed, and it makes sense.
--
Best regards,
Ridong
Powered by blists - more mailing lists