Message-ID: <5fd6c945-9319-4bde-9c0b-3ab864da111c@intel.com>
Date: Wed, 26 Feb 2025 09:09:39 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Eric Biggers <ebiggers@...nel.org>,
David Laight <david.laight.linux@...il.com>
Cc: Xiao Liang <shaw.leon@...il.com>, x86@...nel.org,
linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org,
Ard Biesheuvel <ardb@...nel.org>, Ben Greear <greearb@...delatech.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
Andy Lutomirski <luto@...nel.org>
Subject: Re: [RFC PATCH 1/2] x86/fpu: make kernel-mode FPU reliably usable in
softirqs
On 2/25/25 14:59, Eric Biggers wrote:
> If we had to save/restore a large number of vector registers in every crypto
> function call (not amortized to one save/restore per return to userspace), that
> would be a big performance problem.
I just did a quick trace on my laptop. Looks like I have two main
kernel_fpu_begin() users: LUKS and networking. They both very much seem
to do a bunch of kernel_fpu_begin() operations but very few actual XSAVEs:
      26 : save_fpregs_to_fpstate <-kernel_fpu_begin_mask
     818 : kernel_fpu_begin_mask <-crc32c_pcl_intel_update
    4192 : kernel_fpu_begin_mask <-xts_encrypt_vaes_avx10_256
This is at least _one_ data point very much in favor of Eric's argument
here. It appears that the cost of one XSAVE is amortized across a
bunch of kernel_fpu_begin()s.
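To put a rough number on that amortization, here is a minimal sketch that just does the arithmetic on the counts above. It assumes the two callers shown are the only kernel_fpu_begin_mask() callers in the trace window and that every save_fpregs_to_fpstate() hit came from one of them; the variable names are made up for illustration.

```python
# Counts copied from the trace above.
xsaves = 26  # save_fpregs_to_fpstate <- kernel_fpu_begin_mask

# kernel_fpu_begin_mask() hits, keyed by caller.
# Assumption: these two callers account for all begins in the window.
begins = {
    "crc32c_pcl_intel_update": 818,
    "xts_encrypt_vaes_avx10_256": 4192,
}

total_begins = sum(begins.values())
save_fraction = xsaves / total_begins    # share of begins that really saved state
begins_per_save = total_begins / xsaves  # begins amortized over each save

print(f"kernel_fpu_begin_mask calls: {total_begins}")
print(f"fraction that saved state:   {save_fraction:.2%}")
print(f"begins per XSAVE:            {begins_per_save:.0f}")
```

On these numbers that works out to roughly half a percent of kernel_fpu_begin() calls paying for a save. It is still a single short trace from one laptop, so treat the ratio as illustrative only.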