Message-ID: <CAMj1kXFOS4n4HNCZuoSUT3KUs+pM6OqSYz3Pv5z1dmZJZ70meQ@mail.gmail.com>
Date: Sat, 20 Sep 2025 00:41:51 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: Eric Biggers <ebiggers@...nel.org>
Cc: Ard Biesheuvel <ardb+git@...gle.com>, linux-arm-kernel@...ts.infradead.org,
linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org,
herbert@...dor.apana.org.au, Marc Zyngier <maz@...nel.org>, Will Deacon <will@...nel.org>,
Mark Rutland <mark.rutland@....com>, Kees Cook <keescook@...omium.org>,
Catalin Marinas <catalin.marinas@....com>, Mark Brown <broonie@...nel.org>
Subject: Re: [PATCH 0/5] arm64: Move kernel mode FPSIMD buffer to the stack

On Fri, 19 Sept 2025 at 21:32, Eric Biggers <ebiggers@...nel.org> wrote:
>
> On Thu, Sep 18, 2025 at 08:35:40AM +0200, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@...nel.org>
> >
> > Move the buffer for preserving/restoring the kernel mode FPSIMD state on a
> > context switch out of struct thread_struct, and onto the stack, so that
> > the memory cost is not imposed needlessly on all tasks in the system.
> >
> > Patches #1 - #3 contain some prepwork so that patch #4 can tighten the
> > rules around permitted usage patterns of kernel_neon_begin() and
> > kernel_neon_end(). This permits #5 to provide a stack buffer to
> > kernel_neon_begin() transparently, in a manner that ensures that it will
> > remain available until after the associated call to kernel_neon_end()
> > returns.
> >
> > Cc: Marc Zyngier <maz@...nel.org>
> > Cc: Will Deacon <will@...nel.org>
> > Cc: Mark Rutland <mark.rutland@....com>
> > Cc: Kees Cook <keescook@...omium.org>
> > Cc: Catalin Marinas <catalin.marinas@....com>
> > Cc: Mark Brown <broonie@...nel.org>
> >
> > Ard Biesheuvel (5):
> > crypto/arm64: aes-ce-ccm - Avoid pointless yield of the NEON unit
> > crypto/arm64: sm4-ce-ccm - Avoid pointless yield of the NEON unit
> > crypto/arm64: sm4-ce-gcm - Avoid pointless yield of the NEON unit
> > arm64/fpsimd: Require kernel NEON begin/end calls from the same scope
> > arm64/fpsimd: Allocate kernel mode FP/SIMD buffers on the stack
> >
> > arch/arm64/crypto/aes-ce-ccm-glue.c | 5 +--
> > arch/arm64/crypto/sm4-ce-ccm-glue.c | 10 ++----
> > arch/arm64/crypto/sm4-ce-gcm-glue.c | 10 ++----
> > arch/arm64/include/asm/neon.h | 7 ++--
> > arch/arm64/include/asm/processor.h | 2 +-
> > arch/arm64/kernel/fpsimd.c | 34 +++++++++++++-------
> > 6 files changed, 34 insertions(+), 34 deletions(-)
>
> This looks like the right decision: saving 528 bytes per task is
> significant. 528 bytes is a lot to allocate on the stack too, but
> functions that use the NEON registers are either leaf functions or very
> close to being leaf functions, so it should be okay.
>
Indeed.

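For reference, the 528-byte figure matches the layout of struct user_fpsimd_state: 32 vector registers of 16 bytes each, plus FPSR, FPCR and two reserved 32-bit words. Below is a minimal userspace sketch of that arithmetic. The struct and function names are hypothetical mocks, not the kernel's definitions, and the vector registers are modelled as plain byte arrays rather than 128-bit integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mock mirroring the field sizes of the FPSIMD state. */
struct mock_user_fpsimd_state {
	uint8_t  vregs[32][16];	/* V0..V31, 128 bits each: 512 bytes */
	uint32_t fpsr;		/* status register:   4 bytes */
	uint32_t fpcr;		/* control register:  4 bytes */
	uint32_t reserved[2];	/* padding words:     8 bytes */
};

/* 512 + 4 + 4 + 8 = 528 */
_Static_assert(sizeof(struct mock_user_fpsimd_state) == 528,
	       "mock FPSIMD state should be 528 bytes");

/* Helper so the size can also be checked at run time. */
size_t mock_fpsimd_state_size(void)
{
	return sizeof(struct mock_user_fpsimd_state);
}
```

This is the amount that moves from every task's thread_struct to the stack frame of each kernel mode NEON user.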
> The implementation is a bit unusual, though:
>
> #define kernel_neon_begin() do { __kernel_neon_begin(&(struct user_fpsimd_state){})
> #define kernel_neon_end() __kernel_neon_end(); } while (0)
>
> It works, but normally macros don't start or end code blocks behind the
> scenes like this.

That is kind of the point, as it restricts the use of them to an idiom
that guarantees that the stack variable lives long enough.

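The idiom can be sketched in plain userspace C as follows. Everything prefixed with mock_ is a hypothetical stand-in and not the code from the patch. The begin macro opens a block and passes a compound literal to the helper, and the end macro closes that block. A compound literal at block scope has automatic storage tied to the enclosing block, so the buffer stays valid until the end macro has run. The sketch uses { 0 } rather than an empty initializer so that it builds as standard C11.

```c
#include <stddef.h>

/* Hypothetical stand-in for the 528-byte FP/SIMD save area. */
struct mock_fpsimd_state { unsigned char bytes[528]; };

/* Stands in for the per-task pointer to the save area. */
static struct mock_fpsimd_state *mock_active_buf;

static void mock_neon_begin_impl(struct mock_fpsimd_state *buf)
{
	mock_active_buf = buf;
}

static void mock_neon_end_impl(void)
{
	mock_active_buf = NULL;
}

/* begin opens a block that owns the compound literal; end closes it. */
#define mock_neon_begin() \
	do { mock_neon_begin_impl(&(struct mock_fpsimd_state){ 0 })
#define mock_neon_end() \
	mock_neon_end_impl(); } while (0)

/* Returns 1 if the buffer was live inside the section and released after. */
int mock_neon_user(void)
{
	int buffer_was_live;

	mock_neon_begin();
	/* Inside the hidden block: the compound literal is still alive. */
	buffer_was_live = (mock_active_buf != NULL);
	mock_neon_end();

	return buffer_was_live && mock_active_buf == NULL;
}
```

A function that uses only one of the two macros leaves the braces unbalanced and fails to compile, which is how the pairing within a single scope gets enforced.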
> Perhaps it should be more like s390's
> kernel_fpu_begin(), where the caller provides the buffer that the
> registers are stored in?
>
If we're happy to change the API on both arm64 and ARM, then we could
make it more explicit. It's a lot more work, though.
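
For comparison, a more explicit API where the caller provides the buffer could look roughly like the userspace sketch below. All names here are hypothetical, the signatures are not taken from s390 or from any existing arm64 code, and the helpers only record the pointer instead of saving real register state.

```c
#include <stddef.h>

/* Hypothetical caller-owned save area, sized like the FP/SIMD state. */
struct mock_fpu_buf { unsigned char bytes[528]; };

/* Stands in for the per-task pointer to the save area. */
static struct mock_fpu_buf *mock_saved_to;

/* Caller passes the buffer in explicitly; no hidden block is opened. */
void mock_fpu_begin(struct mock_fpu_buf *buf)
{
	mock_saved_to = buf;
}

void mock_fpu_end(struct mock_fpu_buf *buf)
{
	if (mock_saved_to == buf)
		mock_saved_to = NULL;
}

/* Returns 1 if the caller's buffer was in use during the section only. */
int mock_fpu_user(void)
{
	struct mock_fpu_buf buf;	/* visible in the caller's frame */
	int ok;

	mock_fpu_begin(&buf);
	ok = (mock_saved_to == &buf);
	mock_fpu_end(&buf);

	return ok && mock_saved_to == NULL;
}
```

The stack cost is then visible at every call site, but every caller on both arm64 and ARM would have to be updated to declare and pass the buffer.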