Message-ID: <02239162-b42c-43f0-82eb-9f4af8e96639@redhat.com>
Date: Sat, 27 Dec 2025 03:03:05 -0500
From: Waiman Long <llong@...hat.com>
To: Chen Ridong <chenridong@...weicloud.com>, Tejun Heo <tj@...nel.org>,
 Johannes Weiner <hannes@...xchg.org>, Michal Koutný
 <mkoutny@...e.com>, Shuah Khan <shuah@...nel.org>
Cc: linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
 linux-kselftest@...r.kernel.org, Sun Shaojie <sunshaojie@...inos.cn>
Subject: Re: [cgroup/for-6.20 PATCH 3/4] cgroup/cpuset: Don't fail cpuset.cpus
 change in v2

On 12/25/25 6:54 AM, Chen Ridong wrote:
>
> On 2025/12/25 15:30, Waiman Long wrote:
>> Commit fe8cd2736e75 ("cgroup/cpuset: Delay setting of CS_CPU_EXCLUSIVE
>> until valid partition") introduced a new check to disallow the setting
>> of a new cpuset.cpus.exclusive value that is a superset of a sibling's
>> cpuset.cpus value so that there will at least be one CPU left in the
>> sibling in case the cpuset becomes a valid partition root. This new
>> check does have the side effect of failing a cpuset.cpus change that
>> makes it a subset of a sibling's cpuset.cpus.exclusive value.
>>
>> With v2, users are supposed to be allowed to set whatever value they
>> want in cpuset.cpus without failure. To maintain this rule, the check
>> is now applied only when cpuset.cpus.exclusive is being changed,
>> not when cpuset.cpus is changed.
>>
>> Signed-off-by: Waiman Long <longman@...hat.com>
>> ---
>>   kernel/cgroup/cpuset.c | 30 +++++++++++++++---------------
>>   1 file changed, 15 insertions(+), 15 deletions(-)
>>
>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index 850334dbc36a..83bf6b588e5f 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -609,33 +609,31 @@ static inline bool cpusets_are_exclusive(struct cpuset *cs1, struct cpuset *cs2)
>>   
>>   /**
>>    * cpus_excl_conflict - Check if two cpusets have exclusive CPU conflicts
>> - * @cs1: first cpuset to check
>> - * @cs2: second cpuset to check
>> + * @trial:	the trial cpuset to be checked
>> + * @sibling:	a sibling cpuset to be checked against
>> + * @new_xcpus:	new exclusive_cpus in trial cpuset
>>    *
>>    * Returns: true if CPU exclusivity conflict exists, false otherwise
>>    *
>>    * Conflict detection rules:
>>    * 1. If either cpuset is CPU exclusive, they must be mutually exclusive
>>    * 2. exclusive_cpus masks cannot intersect between cpusets
>> - * 3. The allowed CPUs of one cpuset cannot be a subset of another's exclusive CPUs
>> + * 3. The allowed CPUs of a sibling cpuset cannot be a subset of the new exclusive CPUs
>>    */
>> -static inline bool cpus_excl_conflict(struct cpuset *cs1, struct cpuset *cs2)
>> +static inline bool cpus_excl_conflict(struct cpuset *trial, struct cpuset *sibling,
>> +				      bool new_xcpus)
>>   {
>>   	/* If either cpuset is exclusive, check if they are mutually exclusive */
>> -	if (is_cpu_exclusive(cs1) || is_cpu_exclusive(cs2))
>> -		return !cpusets_are_exclusive(cs1, cs2);
>> +	if (is_cpu_exclusive(trial) || is_cpu_exclusive(sibling))
>> +		return !cpusets_are_exclusive(trial, sibling);
>>   
>>   	/* Exclusive_cpus cannot intersect */
>> -	if (cpumask_intersects(cs1->exclusive_cpus, cs2->exclusive_cpus))
>> +	if (cpumask_intersects(trial->exclusive_cpus, sibling->exclusive_cpus))
>>   		return true;
>>   
>> -	/* The cpus_allowed of one cpuset cannot be a subset of another cpuset's exclusive_cpus */
>> -	if (!cpumask_empty(cs1->cpus_allowed) &&
>> -	    cpumask_subset(cs1->cpus_allowed, cs2->exclusive_cpus))
>> -		return true;
>> -
>> -	if (!cpumask_empty(cs2->cpus_allowed) &&
>> -	    cpumask_subset(cs2->cpus_allowed, cs1->exclusive_cpus))
>> +	/* The cpus_allowed of a sibling cpuset cannot be a subset of the new exclusive_cpus */
>> +	if (new_xcpus && !cpumask_empty(sibling->cpus_allowed) &&
>> +	    cpumask_subset(sibling->cpus_allowed, trial->exclusive_cpus))
>>   		return true;
>>   
>>   	return false;
>> @@ -672,6 +670,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
>>   {
>>   	struct cgroup_subsys_state *css;
>>   	struct cpuset *c, *par;
>> +	bool new_xcpus;
>>   	int ret = 0;
>>   
>>   	rcu_read_lock();
>> @@ -728,10 +727,11 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
>>   	 * overlap. exclusive_cpus cannot overlap with each other if set.
>>   	 */
>>   	ret = -EINVAL;
>> +	new_xcpus = !cpumask_equal(cur->exclusive_cpus, trial->exclusive_cpus);
>>   	cpuset_for_each_child(c, css, par) {
>>   		if (c == cur)
>>   			continue;
>> -		if (cpus_excl_conflict(trial, c))
>> +		if (cpus_excl_conflict(trial, c, new_xcpus))
>>   			goto out;
>>   		if (mems_excl_conflict(trial, c))
>>   			goto out;
> validate_change() is also called from cpuset_update_flag(), which may not change any cpus_allowed or
> exclusive_cpus. This could lead to incorrect checks.
>
> For example:
>
> # cd /sys/fs/cgroup/
> # mkdir a
> # mkdir b
> # echo 1-2 > b/cpuset.cpus.exclusive  -- no conflict with a
> # echo 1 > a/cpuset.cpus
> # echo root > b/cpuset.cpus.partition  -- b becomes a partition root, conflicting with a, but exclusive_cpus is unchanged
> # cat b/cpuset.cpus.partition
> root
>
> As a result, cpuset a (as a member) contains CPU 1, which is a subset of partition b's exclusive
> CPUs — a conflict that might be missed.

Yes, cpuset a has cpuset.cpus set to 1. In v2, cpuset.cpus can be set to
any value, but that does not mean the parent will be able to grant those
CPUs to cpuset a. If you look at a's cpuset.cpus.effective, it will be the
same as the parent's cpuset.cpus.effective, i.e. CPUs 1-2 will be absent.
This is expected behavior and there is nothing wrong.
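
To make the expected outcome concrete, here is a minimal user-space sketch
of how the effective CPUs of a member cpuset fall out in the example above.
It is a simplified model, not kernel code: the one-bit-per-CPU mask type,
the helper names (cpu_range, parent_effective, member_effective) and the
4-CPU system (CPUs 0-3) used in the note below are all assumptions made for
illustration. It only encodes the documented v2 rule that a cpuset.cpus
value of which nothing can be granted is treated like an empty cpuset.cpus.

```c
#include <stdint.h>

/* Hypothetical simplified model: one bit per CPU. This is not kernel code. */
typedef uint64_t cpumask_model_t;

/* Mask with CPUs first..last set (inclusive); assumes first <= last < 64. */
cpumask_model_t cpu_range(unsigned int first, unsigned int last)
{
	cpumask_model_t upto_last = (last == 63)
		? ~(cpumask_model_t)0
		: (((cpumask_model_t)1 << (last + 1)) - 1);
	cpumask_model_t below_first = ((cpumask_model_t)1 << first) - 1;

	return upto_last & ~below_first;
}

/*
 * CPUs the parent can still hand to its member children once a child
 * partition root has taken its exclusive CPUs.
 */
cpumask_model_t parent_effective(cpumask_model_t parent_cpus,
				 cpumask_model_t partition_xcpus)
{
	return parent_cpus & ~partition_xcpus;
}

/*
 * Effective CPUs of a member cpuset: the part of its requested
 * cpuset.cpus that the parent can grant. If nothing can be granted
 * (or nothing was requested), it falls back to the parent's
 * effective CPUs.
 */
cpumask_model_t member_effective(cpumask_model_t requested,
				 cpumask_model_t parent_eff)
{
	cpumask_model_t granted = requested & parent_eff;

	return granted ? granted : parent_eff;
}
```

With CPUs 0-3 in the parent and b holding 1-2 as a partition root,
parent_effective(cpu_range(0, 3), cpu_range(1, 2)) leaves CPUs 0 and 3.
Passing a's request, cpu_range(1, 1), to member_effective() then returns
that same mask, so a runs on CPUs 0 and 3 and never on CPU 1.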

Cheers,
Longman

