Message-ID: <1af20802-7e3f-6f93-bec0-0d625533ce4a@arm.com>
Date: Thu, 11 Apr 2019 16:12:11 +0100
From: Julien Grall <julien.grall@....com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Dave Martin <Dave.Martin@....com>,
linux-arm-kernel@...ts.infradead.org,
linux-rt-users@...r.kernel.org, catalin.marinas@....com,
will.deacon@....com, ard.biesheuvel@...aro.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] arm64/fpsimd: Don't disable softirq when touching
FPSIMD/SVE state
Hi Sebastian,
On 4/5/19 4:42 PM, Sebastian Andrzej Siewior wrote:
> On 2019-04-05 16:17:50 [+0100], Julien Grall wrote:
>> Hi,
> Hi,
>
>>> A per-CPU lock? It has to be a raw_spinlock_t because a normal
>>> spin_lock() / local_lock() would allow scheduling and might be taken as
>>> part of the context switch or soon after.
>> raw_spinlock_t would not work here without disabling preemption. Otherwise
>> you may end up recursing on the lock and therefore deadlock. But then it
>> raises the question of the usefulness of the lock here.
>>
>> However, I don't really understand why allowing the scheduling would be a
>> problem here. Is it a concern because we will waste cycles trying to
>> restore/save a context that will be scratched as soon as we release the
>> lock?
>
> If you hold the lock within the kernel thread and every kernel thread
> acquires it before doing any SIMD operations then you are good. It could
> be a sleeping lock. What happens if you hold the lock, are scheduled out
> and a user task is about to be scheduled? How do you force the kernel
> thread out / give up the FPU registers?
You would need the scheduled-out thread to finish running the critical
section before the next thread can run. I was under the impression that,
with a sleeping lock, the priority of the scheduled-out thread would get
boosted so that it finishes the critical section quickly.
I agree it means that you will waste time restoring registers that will
get trashed right after you leave the critical section. This is probably
not really efficient.
Anyway, that was only a suggestion I haven't fully thought through. I
have no plans to do more work on the fpsimd context switch beyond this
patch.
>
> That preempt_disable() + local_bh_disable() might not be the prettiest
> thing but how bad is it actually?
From the measurements I did on non-RT, it is beneficial to keep softirqs
enabled (see [1]).
I don't have a platform where FPSIMD is used in softirqs (or at least not
by default). So for testing purposes, I wrote a tasklet (based on an
hrtimer) that uses the FPSIMD registers every 1ms when allowed (i.e.
may_use_simd() returns true). I let it run for a while and noticed that
the tasklet found !may_use_simd() only 0.15% of the time.
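
For reference, below is a rough sketch of the kind of test module I mean;
the names and the bookkeeping are illustrative, not the exact code I ran:

#include <linux/module.h>
#include <linux/hrtimer.h>
#include <linux/interrupt.h>
#include <linux/ktime.h>
#include <asm/neon.h>
#include <asm/simd.h>

static struct hrtimer test_timer;
static unsigned long total_runs, rejected_runs;

/* Runs in softirq context, so may_use_simd() can legitimately fail. */
static void fpsimd_test_tasklet(unsigned long data)
{
        total_runs++;
        if (!may_use_simd()) {
                rejected_runs++;
                return;
        }

        kernel_neon_begin();
        /* ... touch the FPSIMD registers here ... */
        kernel_neon_end();
}

static DECLARE_TASKLET(test_tasklet, fpsimd_test_tasklet, 0);

/* Kick the tasklet every 1ms. */
static enum hrtimer_restart test_timer_fn(struct hrtimer *t)
{
        tasklet_schedule(&test_tasklet);
        hrtimer_forward_now(t, ms_to_ktime(1));
        return HRTIMER_RESTART;
}

static int __init fpsimd_test_init(void)
{
        hrtimer_init(&test_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
        test_timer.function = test_timer_fn;
        hrtimer_start(&test_timer, ms_to_ktime(1), HRTIMER_MODE_REL);
        return 0;
}

static void __exit fpsimd_test_exit(void)
{
        hrtimer_cancel(&test_timer);
        tasklet_kill(&test_tasklet);
        pr_info("fpsimd test: %lu/%lu runs hit !may_use_simd()\n",
                rejected_runs, total_runs);
}

module_init(fpsimd_test_init);
module_exit(fpsimd_test_exit);
MODULE_LICENSE("GPL");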
Furthermore, from what I understood in this thread, there are only a few
limited use cases where FPSIMD is used in softirqs. So it seems better to
me to avoid disabling softirqs, at least on non-RT.
On RT, aren't all the softirqs handled in a thread? So what would be the
benefit of disabling softirqs if we already disable preemption?
In any case, this patch introduces new helpers (get_cpu_fpsimd_context()
and put_cpu_fpsimd_context()) to delimit regions using the HW FPSIMD. So
it would be easy to modify the behavior if we wanted to.
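
To give an idea of the shape (see the patch for the exact bodies; this is
only a sketch), the helpers boil down to disabling preemption and marking
the FPSIMD context busy on the current CPU, rather than disabling
softirqs:

static DEFINE_PER_CPU(bool, fpsimd_context_busy);

/*
 * Claim the FPSIMD context for this CPU: disable preemption and flag the
 * context busy so that softirq users (checking may_use_simd()) back off,
 * instead of blocking softirqs outright.
 */
static void get_cpu_fpsimd_context(void)
{
        preempt_disable();
        WARN_ON(__this_cpu_xchg(fpsimd_context_busy, true));
}

static void put_cpu_fpsimd_context(void)
{
        WARN_ON(!__this_cpu_xchg(fpsimd_context_busy, false));
        preempt_enable();
}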
> Latency wise you can't schedule(). From RT point of view you need to
> enable preemption while going from page to page because of the possible
> kmap() or kmalloc() (on badly aligned src/dst) with the crypto's
> page-walk code.
Makes sense. However, I don't think we can keep preemption enabled with
the current implementation of the FPSIMD context switch. I noticed you
disabled crypto for Arm64 because of the allocations; I have a todo to
look at what we can do.
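
For the crypto case, the pattern you describe would look roughly like the
sketch below: hold the NEON unit only around the actual data crunching so
the walk can sleep in between (my_cipher_neon() is a placeholder for a
real SIMD routine, not an existing function):

#include <asm/neon.h>
#include <crypto/internal/skcipher.h>

static int example_encrypt(struct skcipher_request *req)
{
        struct skcipher_walk walk;
        int err;

        err = skcipher_walk_virt(&walk, req, false);

        while (walk.nbytes) {
                unsigned int n = walk.nbytes;

                /* Only the SIMD work itself runs with preemption off. */
                kernel_neon_begin();
                my_cipher_neon(walk.dst.virt.addr, walk.src.virt.addr, n);
                kernel_neon_end();

                /*
                 * Preemption is back on here, so the walk code may
                 * kmap()/kmalloc() for the next page without running
                 * in atomic context.
                 */
                err = skcipher_walk_done(&walk, walk.nbytes - n);
        }

        return err;
}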
Cheers,
[1] https://marc.info/?l=linux-rt-users&m=155499183812211&w=2
>
> Sebastian
>
--
Julien Grall