Message-ID: <43da8f9d-f0fd-d67b-7384-fc03ad159f29@redhat.com>
Date:   Fri, 27 Jan 2023 09:54:26 -0500
From:   Waiman Long <longman@...hat.com>
To:     Will Deacon <will@...nel.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Valentin Schneider <vschneid@...hat.com>,
        Phil Auld <pauld@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] sched: Store restrict_cpus_allowed_ptr() call state

On 1/27/23 07:59, Will Deacon wrote:
> Hi Waiman,
>
> On Thu, Jan 26, 2023 at 08:55:27PM -0500, Waiman Long wrote:
>> The user_cpus_ptr field was originally added by commit b90ca8badbd1
>> ("sched: Introduce task_struct::user_cpus_ptr to track requested
>> affinity"). It was used only by arm64 arch due to possible asymmetric
>> CPU setup.
>>
>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>> cpumask"), task_struct::user_cpus_ptr is repurposed to store user
>> requested cpu affinity specified in the sched_setaffinity().
>>
>> This results in a slight performance regression on an arm64
>> system when booted with "allow_mismatched_32bit_el0"
>> on the command-line.  The arch code will (amongst
>> other things) call force_compatible_cpus_allowed_ptr() and
>> relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 64-bit
>> task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
>> will always result in a __sched_setaffinity() call whether there is a
>> previous force_compatible_cpus_allowed_ptr() call or not.
>>
>> In order to fix this regression, a new scheduler flag
>> task_struct::cpus_allowed_restricted is now added to track if
>> force_compatible_cpus_allowed_ptr() has been called before or not. This
>> patch also updates the comments in force_compatible_cpus_allowed_ptr()
>> and relax_compatible_cpus_allowed_ptr() and handles their interaction
>> with sched_setaffinity().
>>
>> This patch also removes the task_user_cpus() helper. In the case of
>> relax_compatible_cpus_allowed_ptr(), cpu_possible_mask as user_cpu_ptr
>> masking will be performed within __sched_setaffinity() anyway.
>>
>> Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask")
>> Reported-by: Will Deacon <will@...nel.org>
>> Signed-off-by: Waiman Long <longman@...hat.com>
>> ---
>>   include/linux/sched.h |  3 +++
>>   kernel/sched/core.c   | 25 +++++++++++++++++--------
>>   kernel/sched/sched.h  |  8 +-------
>>   3 files changed, 21 insertions(+), 15 deletions(-)
> So this doesn't even build...
>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index bb1ee6d7bdde..d7bc809c109e 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -2999,6 +2999,10 @@ static int __set_cpus_allowed_ptr(struct task_struct *p,
>>   	struct rq *rq;
>>   
>>   	rq = task_rq_lock(p, &rf);
>> +
>> +	if (ctx->flags & SCA_CLR_RESTRICT)
>> +		p->cpus_allowed_restricted = 0;
>> +
>>   	/*
>>   	 * Masking should be skipped if SCA_USER or any of the SCA_MIGRATE_*
>>   	 * flags are set.
>> @@ -3025,8 +3029,8 @@ EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
>>   /*
>>    * Change a given task's CPU affinity to the intersection of its current
>>    * affinity mask and @subset_mask, writing the resulting mask to @new_mask.
>> - * If user_cpus_ptr is defined, use it as the basis for restricting CPU
>> - * affinity or use cpu_online_mask instead.
>> + * The cpus_allowed_restricted bit is set to indicate to a later
>> + * relax_compatible_cpus_allowed_ptr() call to relax the cpumask.
>>    *
>>    * If the resulting mask is empty, leave the affinity unchanged and return
>>    * -EINVAL.
>> @@ -3044,6 +3048,7 @@ static int restrict_cpus_allowed_ptr(struct task_struct *p,
>>   	int err;
>>   
>>   	rq = task_rq_lock(p, &rf);
>> +	p->cpus_allowed_restricted = 1;
>>   
>>   	/*
>>   	 * Forcefully restricting the affinity of a deadline task is
>> @@ -3055,7 +3060,8 @@ static int restrict_cpus_allowed_ptr(struct task_struct *p,
>>   		goto err_unlock;
>>   	}
>>   
>> -	if (!cpumask_and(new_mask, task_user_cpus(p), subset_mask)) {
>> +	if (p->user_cpu_ptr &&
>> +	    !cpumask_and(new_mask, p->user_cpu_ptr, subset_mask)) {
> s/user_cpu_ptr/user_cpus_ptr/
>
>>   		err = -EINVAL;
>>   		goto err_unlock;
>>   	}
>> @@ -3069,9 +3075,8 @@ static int restrict_cpus_allowed_ptr(struct task_struct *p,
>>   
>>   /*
>>    * Restrict the CPU affinity of task @p so that it is a subset of
>> - * task_cpu_possible_mask() and point @p->user_cpus_ptr to a copy of the
>> - * old affinity mask. If the resulting mask is empty, we warn and walk
>> - * up the cpuset hierarchy until we find a suitable mask.
>> + * task_cpu_possible_mask(). If the resulting mask is empty, we warn
>> + * and walk up the cpuset hierarchy until we find a suitable mask.
>>    */
>>   void force_compatible_cpus_allowed_ptr(struct task_struct *p)
>>   {
>> @@ -3125,11 +3130,15 @@ __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx);
>>   void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
>>   {
>>   	struct affinity_context ac = {
>> -		.new_mask  = task_user_cpus(p),
>> -		.flags     = 0,
>> +		.new_mask  = cpu_possible_mask;
> s/;/,/
>
> But even with those two things fixed, I'm seeing new failures in my
> testing which I think are because restrict_cpus_allowed_ptr() is failing
> unexpectedly when called by force_compatible_cpus_allowed_ptr().
>
> For example, just running a 32-bit task on an asymmetric system results
> in:
>
> $ ./hello32
> [ 1690.855341] Overriding affinity for process 580 (hello32) to CPUs 2-3
>
> That then has knock-on effects such as losing track of the initial affinity
> mask and not being able to restore it if the forcefully-affined 32-bit task
> exec()s a 64-bit program.

I thought I had fixed the build failure, but apparently it is still there.
I will fix it.
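
To make the intended behaviour concrete, here is a minimal user-space
sketch of the flag tracking. It is not the kernel code: struct task_model,
model_force(), model_relax() and model_demo() are hypothetical names, the
cpumasks are reduced to plain bitmasks, and the early return in
model_relax() is an assumption about how the flag is meant to be consumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few task_struct fields involved. */
struct task_model {
	unsigned long cpus_mask;       /* current affinity, one bit per CPU */
	bool cpus_allowed_restricted;  /* set by force, cleared by relax */
	int setaffinity_calls;         /* counts trips through the slow path */
};

/* Models force_compatible_cpus_allowed_ptr(): restrict and remember it. */
static void model_force(struct task_model *p, unsigned long compat_mask)
{
	unsigned long new_mask = p->cpus_mask & compat_mask;

	p->cpus_allowed_restricted = true;
	if (new_mask)
		p->cpus_mask = new_mask;
}

/*
 * Models relax_compatible_cpus_allowed_ptr(). Assumption: when no earlier
 * force call set the flag, the expensive affinity update is skipped.
 * Simplification: the real code would also apply the user requested mask.
 */
static void model_relax(struct task_model *p, unsigned long possible_mask)
{
	if (!p->cpus_allowed_restricted)
		return;

	p->cpus_mask = possible_mask;
	p->cpus_allowed_restricted = false;
	p->setaffinity_calls++;
}

/* Walks one exec() sequence and returns how often the slow path ran. */
int model_demo(void)
{
	struct task_model p = { .cpus_mask = 0xf };  /* CPUs 0-3 */

	model_relax(&p, 0xf);   /* 64-bit exec, nothing to undo: skipped */
	assert(p.setaffinity_calls == 0);

	model_force(&p, 0xc);   /* 32-bit exec: restrict to CPUs 2-3 */
	assert(p.cpus_mask == 0xc && p.cpus_allowed_restricted);

	model_relax(&p, 0xf);   /* 64-bit exec: restore and clear the flag */
	assert(p.cpus_mask == 0xf && !p.cpus_allowed_restricted);

	model_relax(&p, 0xf);   /* flag already clear: skipped again */
	return p.setaffinity_calls;
}
```

In this model only the relax call that follows a force call reaches the
slow path, which is the behaviour the new flag is supposed to restore.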

BTW, which arm64 CPUs support "allow_mismatched_32bit_el0"? I am trying
to see if I can reproduce the issue, but I am not sure if I have access
to any CPUs that have this capability.

Cheers,
Longman
