Message-ID: <975bf7c2-aa47-4ec2-b71f-d3e31644947d@huaweicloud.com>
Date: Thu, 13 Nov 2025 09:21:38 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Waiman Long <llong@...hat.com>, Sun Shaojie <sunshaojie@...inos.cn>
Cc: tj@...nel.org, hannes@...xchg.org, mkoutny@...e.com, shuah@...nel.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v1] cpuset: Avoid unnecessary partition invalidation
On 2025/11/13 2:05, Waiman Long wrote:
> On 11/12/25 4:46 AM, Sun Shaojie wrote:
>> Hi Ridong,
>>
>> Thank you for your response.
>>
>> From your reply "in case 1, A1 can also be converted to a partition," I
>> realize there might be a misunderstanding. The scenario I'm addressing
>> involves two sibling cgroups where one is an effective partition root and
>> the other is not, and both have empty cpuset.cpus.exclusive. Let me
>> explain the intention behind case 1 in detail, which will also illustrate
>> why this has negative impacts on our product.
>>
>> In case 1, after #3 completes, A1 is already a valid partition root - this
>> is correct. After #4, B1 is created, and B1 is non-exclusive. After #5,
>> A1 changes from "root" to "root invalid". But A1 becoming "root invalid"
>> could be unnecessary because having A1 remain as "root" might be more
>> acceptable. Here's the analysis:
>>
>> As documented in cgroup-v2.rst regarding cpuset.cpus: "The actual list of
>> CPUs to be granted, however, is subjected to constraints imposed by its
>> parent and can differ from the requested CPUs". This means that although
>> we're requesting CPUs 0-3 for B1, we can accept that the actual available
>> CPUs in B1 might not be 0-3.
>>
>> Based on this characteristic, in our product's implementation for case 1,
>> before writing to B1's cpuset.cpus in #5, we check B1's parent
>> cpuset.cpus.effective and know that the CPUs available for B1 don't include
>> 0-1 (since 0-1 are exclusively used by A1). However, we still want to set
>> B1's cpuset.cpus to 0-3 because we hope that when 0-1 become available in
>> the future, B1 can use them without affecting the normal operation of other
>> cgroups.
>>
>> The reality is that because B1's requested cpuset.cpus (0-3) conflicts with
>> A1's exclusive CPUs (0-1) at that moment, it destroys the validity of A1's
>> partition root. So why must the current rule sacrifice A1's validity to
>> accommodate B1's CPU request? In this situation, B1 can clearly use 2-3
>> while A1 exclusively uses 0-1 - they don't need to conflict.
>>
>> This patch narrows the exclusivity conflict check scope to only between
>> partitions. Moreover, user-specified CPUs (including cpuset.cpus and
>> cpuset.cpus.exclusive) only have true exclusive meaning within effective
>> partitions. So why should the current rule perform exclusivity conflict
>> checks between an exclusive partition and a non-exclusive member? This is
>> clearly unnecessary.
>
> As I have said in the other thread, v2 exclusive cpuset checking follows the v1 rule. However, the
> behavior of setting cpuset.cpus differs between v1 and v2. In v1, setting cpuset.cpus can fail if
> there is some conflict. In v2, users are allowed to set whatever value they want without failure, but
> the effective CPUs granted will be subject to constraints and may differ from cpuset.cpus. So in that
> sense, I think it makes sense to relax the exclusive cpuset check for v2, but we still need to keep
> the current v1 behavior. Please update your patch to do that.
>
> Cheers,
> Longman
>
Hi, Longman.
Setting cpuset.cpus did not fail; instead, it invalidated the sibling cpuset partition.
If we relax this rule, there is one case we should consider: what if we run
    echo root > /sys/fs/cgroup/B1/cpuset.cpus.partition
after step #5? There is no conflict check when enabling the partition.
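To make the concern concrete, here is a toy Python model of the scenario, not kernel code. It assumes A1 is a partition root holding CPUs 0-1 and B1 requests 0-3, as described in the quoted case 1. The names `parse_cpulist` and `has_conflict` and the `relaxed` flag are hypothetical, and the two rules are simplified readings of the current check and of the proposed one ("only between partitions") as stated in this thread.

```python
def parse_cpulist(spec):
    """Parse a cpulist string such as "0-1,3" into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def has_conflict(a_cpus, a_is_partition, b_cpus, b_is_partition, relaxed):
    """Return True if two siblings' requested CPUs count as an exclusivity conflict.

    relaxed=False: any overlap involving a partition root counts
                   (simplified reading of the current rule).
    relaxed=True:  only an overlap between two partition roots counts
                   (simplified reading of the proposed rule).
    """
    if not (a_cpus & b_cpus):
        return False
    if relaxed:
        return bool(a_is_partition and b_is_partition)
    return bool(a_is_partition or b_is_partition)


a1 = parse_cpulist("0-1")   # A1: partition root
b1 = parse_cpulist("0-3")   # B1: plain member after step #5

# Step #5 under each rule: B1 is still a member.
print(has_conflict(a1, True, b1, False, relaxed=False))
print(has_conflict(a1, True, b1, False, relaxed=True))

# B1 is then switched to "root": the overlap is now between two partitions.
print(has_conflict(a1, True, b1, True, relaxed=True))
```

Under the relaxed rule the write in step #5 no longer conflicts, but the same overlap becomes a partition-versus-partition conflict as soon as B1 is made a root, so in this model something would have to run the check at that point.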
--
Best regards,
Ridong