Message-ID: <9861c077-55c6-60f4-02ea-bd0138945c16@redhat.com>
Date: Thu, 26 Jan 2023 15:58:41 -0500
From: Waiman Long <longman@...hat.com>
To: Will Deacon <will@...nel.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
Phil Auld <pauld@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-kernel@...r.kernel.org, regressions@...ts.linux.dev,
regressions@...mhuis.info
Subject: Re: [PATCH v2] sched: Store restrict_cpus_allowed_ptr() call state
On 1/26/23 15:49, Waiman Long wrote:
> On 1/26/23 11:11, Will Deacon wrote:
>> On Tue, Jan 24, 2023 at 03:24:36PM -0500, Waiman Long wrote:
>>> On 1/24/23 14:48, Will Deacon wrote:
>>>> On Fri, Jan 20, 2023 at 09:17:49PM -0500, Waiman Long wrote:
>>>>> The user_cpus_ptr field was originally added by commit b90ca8badbd1
>>>>> ("sched: Introduce task_struct::user_cpus_ptr to track requested
>>>>> affinity"). It was used only by arm64 arch due to possible asymmetric
>>>>> CPU setup.
>>>>>
>>>>> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
>>>>> cpumask"), task_struct::user_cpus_ptr is repurposed to store user
>>>>> requested cpu affinity specified in the sched_setaffinity().
>>>>>
>>>>> This results in a performance regression in an arm64 system when booted
>>>>> with "allow_mismatched_32bit_el0" on the command-line. The arch code will
>>>>> (amongst other things) call force_compatible_cpus_allowed_ptr() and
>>>>> relax_compatible_cpus_allowed_ptr() when exec()'ing a 32-bit or a 64-bit
>>>>> task respectively. Now a call to relax_compatible_cpus_allowed_ptr()
>>>>> will always result in a __sched_setaffinity() call whether there is a
>>>>> previous force_compatible_cpus_allowed_ptr() call or not.
>>>> I'd argue it's more than just a performance regression -- the affinity
>>>> masks are set incorrectly, which is a user visible thing
>>>> (i.e. sched_getaffinity() gives unexpected values).
>>> Can you elaborate a bit more on what you mean by getting unexpected
>>> sched_getaffinity() results? You mean the result is wrong after a
>>> relax_compatible_cpus_allowed_ptr() call, right?
>> Yes, as in the original report. If, on a 4-CPU system, I do the following
>> with v6.1 and "allow_mismatched_32bit_el0" on the kernel cmdline:
>>
>> # for c in `seq 1 3`; do echo 0 > /sys/devices/system/cpu/cpu$c/online; done
>> # yes > /dev/null &
>> [1] 334
>> # taskset -p 334
>> pid 334's current affinity mask: 1
>> # for c in `seq 1 3`; do echo 1 > /sys/devices/system/cpu/cpu$c/online; done
>> # taskset -p 334
>> pid 334's current affinity mask: f
>>
>> but with v6.2-rc5 that last taskset invocation gives:
>>
>> pid 334's current affinity mask: 1
>>
>> so, yes, the performance definitely regresses, but that's because the
>> affinity mask is wrong!
>
> I see what you mean now. Hotplug doesn't work properly now because
> user_cpus_ptr has been repurposed to store the value set by
> sched_setaffinity() rather than the previous cpus_mask before
> force_compatible_cpus_allowed_ptr().
>
> One possible solution is to modify the hotplug related code to check
> for the cpus_allowed_restricted, and if set, check
> task_cpu_possible_mask() to see if the cpu can be added back to its
> cpus_mask. I will take a further look at that later.

Wait, I think the cpuset hotplug code should be able to restore the
right cpumask since task_cpu_possible_mask() is used there. Is cpuset
enabled? Does the test work without allow_mismatched_32bit_el0?

I think there may be a bug somewhere.

Cheers,
Longman
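
For readers comparing the affinity masks in the reproduction quoted above
(1 on v6.2-rc5 versus f on v6.1), the following is a minimal sketch in POSIX
shell and is not part of the original message. The helper names
mask_to_cpus and pid_affinity_mask are hypothetical, and the sketch assumes
the mask fits in the shell's signed 64-bit arithmetic (at most 63 CPUs).

```shell
#!/bin/sh
# Hypothetical helpers, not from the thread: decode the hex affinity mask
# that "taskset -p" prints into a list of CPU numbers.

# mask_to_cpus HEXMASK
# Prints the CPU indices whose bits are set in HEXMASK, space separated.
# Assumption: HEXMASK has no "0x" prefix and fits in 63 bits.
mask_to_cpus() {
    mask=$((0x$1))
    cpu=0
    out=""
    while [ "$mask" -ne 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            # Append the CPU number, adding a separator after the first entry.
            out="${out:+$out }$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    printf '%s\n' "$out"
}

# pid_affinity_mask PID
# Prints the Cpus_allowed field of /proc/PID/status with the comma
# separators removed, so the result can be passed to mask_to_cpus.
pid_affinity_mask() {
    awk '/^Cpus_allowed:/ { gsub(",", "", $2); print $2 }' "/proc/$1/status"
}
```

With PID 334 from the quoted example, running
mask_to_cpus "$(pid_affinity_mask 334)" before and after bringing CPUs 1-3
back online would show whether the task's mask was widened again.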