Message-ID: <c10e4f69-9951-6c38-6e28-fafcaec00d89@redhat.com>
Date: Tue, 16 Aug 2022 18:11:03 -0400
From: Waiman Long <longman@...hat.com>
To: Tejun Heo <tj@...nel.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>,
Will Deacon <will@...nel.org>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v5 3/3] cgroup/cpuset: Keep user set cpus affinity
On 8/16/22 16:15, Tejun Heo wrote:
> On Tue, Aug 16, 2022 at 03:27:34PM -0400, Waiman Long wrote:
>> +static int cpuset_set_cpus_allowed_ptr(struct task_struct *p,
>> +				       const struct cpumask *mask)
>> +{
>> +	cpumask_var_t new_mask;
>> +	int ret;
>> +
>> +	if (!READ_ONCE(p->user_cpus_ptr)) {
>> +		ret = set_cpus_allowed_ptr(p, mask);
>> +		/*
>> +		 * If user_cpus_ptr becomes set now, we are racing with
>> +		 * a concurrent sched_setaffinity(). So use the newly
>> +		 * set user_cpus_ptr and retry again.
>> +		 *
>> +		 * TODO: We cannot detect change in the cpumask pointed to
>> +		 * by user_cpus_ptr. We will have to add a sequence number
>> +		 * if such a race needs to be addressed.
>> +		 */
> This is too ugly and obviously broken. Let's please do it properly.
Actually, there is a similar construct in __sched_setaffinity():
again:
	retval = __set_cpus_allowed_ptr(p, new_mask, SCA_CHECK);
	if (retval)
		goto out_free_new_mask;

	cpuset_cpus_allowed(p, cpus_allowed);
	if (!cpumask_subset(new_mask, cpus_allowed)) {
		/*
		 * We must have raced with a concurrent cpuset update.
		 * Just reset the cpumask to the cpuset's cpus_allowed.
		 */
		cpumask_copy(new_mask, cpus_allowed);
		goto again;
	}
It is hard to synchronize different subsystems atomically without
running into locking issues. Let me think about what can be done in
this case.
Is using a sequence number to detect the race, with a retry, good enough?
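
For concreteness, something like the following is what I have in mind.
This is only a rough sketch: it assumes a hypothetical seqcount_t field
(call it user_cpus_seq) in task_struct, which sched_setaffinity() would
bump with write_seqcount_begin()/write_seqcount_end() under pi_lock
around every update of user_cpus_ptr. It also glosses over the lifetime
of the old cpumask pointed to by user_cpus_ptr, which would still need
pi_lock or RCU protection:

static int cpuset_set_cpus_allowed_ptr(struct task_struct *p,
				       const struct cpumask *mask)
{
	cpumask_var_t new_mask;
	unsigned int seq;
	int ret;

	if (!alloc_cpumask_var(&new_mask, GFP_KERNEL))
		return -ENOMEM;

retry:
	seq = read_seqcount_begin(&p->user_cpus_seq);	/* hypothetical field */

	if (!READ_ONCE(p->user_cpus_ptr)) {
		ret = set_cpus_allowed_ptr(p, mask);
	} else {
		/* Restrict the cpuset mask by the user-set affinity. */
		cpumask_and(new_mask, mask, p->user_cpus_ptr);
		ret = set_cpus_allowed_ptr(p, new_mask);
	}

	/*
	 * If the sequence count changed, a concurrent
	 * sched_setaffinity() has installed or updated
	 * user_cpus_ptr; redo the update with the new value.
	 */
	if (read_seqcount_retry(&p->user_cpus_seq, seq))
		goto retry;

	free_cpumask_var(new_mask);
	return ret;
}

That would catch both a newly installed user_cpus_ptr and a change to
the cpumask it points to, which the READ_ONCE() check alone cannot.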
Cheers,
Longman