Message-ID: <1e2eef0a-4637-4b4f-aea5-71e3e519757d@redhat.com>
Date: Fri, 12 Dec 2025 23:58:53 -0500
From: Waiman Long <llong@...hat.com>
To: Michal Koutný <mkoutny@...e.com>,
Waiman Long <llong@...hat.com>
Cc: Sun Shaojie <sunshaojie@...inos.cn>, chenridong@...weicloud.com,
cgroups@...r.kernel.org, hannes@...xchg.org, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, shuah@...nel.org, tj@...nel.org
Subject: Re: [PATCH v5] cpuset: Avoid invalidating sibling partitions on
cpuset.cpus conflict.
On 12/8/25 9:32 AM, Michal Koutný wrote:
> Hi Waiman.
>
> On Wed, Nov 26, 2025 at 02:43:50PM -0500, Waiman Long <llong@...hat.com> wrote:
>> Modifications to cpumasks are all serialized by the cpuset_mutex. If you are
>> referring to 2 or more tasks doing parallel updates to various cpuset
>> control files of sibling cpusets, the results can actually vary depending on
>> the actual serialization results of those operations.
> I meant the latter, i.e. the difference in results when concurrent tasks
> do the update (e.g. two containers start in parallel). I don't see an
> issue with the race wrt consistency of in-kernel data. We're on the same
> page here.
>
>> One difference between cpuset.cpus and cpuset.cpus.exclusive is the fact
>> that operations on cpuset.cpus.exclusive can fail if the result is not
>> exclusive WRT sibling cpusets, but becoming a valid partition is guaranteed
>> unless none of the exclusive CPUs are passed down from the parent. The use
>> of cpuset.cpus.exclusive is required for creating remote partition.
>>
>> OTOH, changes to cpuset.cpus will never fail, but becoming a valid partition
>> root is not guaranteed and is limited to the creation of local partition
>> only.
>>
>> Does that answer your question?
> It does help my understanding. Do you envision that remote and local
> partitions should be used together (in one subtree)?
It should be rare to have both remote and local partitions enabled in the
same system, though it is not disallowed. Local partitions should only be
used on systems that run a small number of applications, with one or just
a few that need partition support. For systems that run a large number of
containerized applications, like a Kubernetes-managed system, local
partitions cannot be used because of the way container management is
done: the actual cgroups associated with a container can be quite far
from the cgroup root. Remote partitions were created for such a use case,
where local partitions cannot be used at all.
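
To make the difference concrete, here is a rough sketch of the control-file
writes involved in each case. It is only an illustration, not a description
of how any container manager actually does it: the mount point
/sys/fs/cgroup, the cgroup names ("app", "kubepods/pod1/ctr") and the CPU
lists are assumptions, and the helper functions are hypothetical. The sketch
only builds the list of writes; it does not touch the filesystem, and it
leaves out enabling the cpuset controller in cgroup.subtree_control along
the path.

```python
from pathlib import PurePosixPath

# Assumed cgroup v2 mount point; adjust for the system at hand.
CGROUP_ROOT = PurePosixPath("/sys/fs/cgroup")


def local_partition_writes(child, cpus):
    """Writes for a local partition.

    Assumes `child` sits directly under the cgroup root, so its parent
    is already a valid partition root.
    """
    base = CGROUP_ROOT / child
    return [
        (str(base / "cpuset.cpus"), cpus),
        (str(base / "cpuset.cpus.partition"), "root"),
    ]


def remote_partition_writes(path, cpus):
    """Writes for a remote partition.

    The exclusive CPUs are passed down through cpuset.cpus.exclusive of
    every cgroup on the way to the target, then the target is made a
    partition root.
    """
    writes = []
    current = CGROUP_ROOT
    for part in PurePosixPath(path).parts:
        current = current / part
        writes.append((str(current / "cpuset.cpus.exclusive"), cpus))
    writes.append((str(current / "cpuset.cpus.partition"), "root"))
    return writes


if __name__ == "__main__":
    for target, value in local_partition_writes("app", "2-3"):
        print(f"echo {value} > {target}")
    for target, value in remote_partition_writes("kubepods/pod1/ctr", "4-7"):
        print(f"echo {value} > {target}")
```

Note that the write to cpuset.cpus in the local case will not fail, but
cpuset.cpus.partition has to be read back afterwards to see whether the
partition is actually valid. In the remote case, a write to
cpuset.cpus.exclusive can fail if the CPUs conflict with a sibling.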
Cheers,
Longman