Message-ID: <YnKh96isoB7jiFrv@zx2c4.com>
Date: Wed, 4 May 2022 17:55:35 +0200
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>,
LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
Filipe Manana <fdmanana@...e.com>, linux-crypto@...r.kernel.org
Subject: Re: [patch 3/3] x86/fpu: Make FPU protection more robust
Hi Thomas,

On Wed, May 04, 2022 at 05:36:38PM +0200, Thomas Gleixner wrote:
> But the only use case which utilizes FPU from hard interrupt context is
> the random generator via add_randomness_...().
>
> I did a benchmark of these functions, which invoke blake2s_update()
> three times in a row, on a SKL-X and a ZEN3. The generic code and the
> FPU accelerated code are pretty much on par vs. execution time of the
> algorithm itself plus/minus noise.
>
> IOW, using the FPU blindly for this kind of computations is not
> necessarily a good plan. I have no idea how these things are analyzed
> and evaluated if at all. Maybe the crypto people can shed some light on
> this.

drivers/net/wireguard/{noise,cookie}.c makes pretty heavy use of BLAKE2s
in hot paths where the FPU is already being used for other algorithms,
and so there the save/restore is worth it (assuming restore finally
works lazily). In benchmarks, the SIMD code made a real difference.

But this presumably regards mix_pool_bytes() in the RNG. If it turns out
that supporting the FPU in hard IRQ context is a major PITA, and the RNG
is the only thing making use of it, then sure, drop hard IRQ context
support for it. However... This may be unearthing a larger bug.

Sebastian and I put in a decent amount of work during 5.18 to remove all
calls to mix_pool_bytes() (and hence to blake2s_compress()) from
add_interrupt_randomness(). Have a look:
https://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git/tree/drivers/char/random.c#n1289
It now accumulates in some per-CPU buffer, and then every 64 interrupts
a worker runs that does the actual mix_pool_bytes() from kthread
context.
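
The scheme described above (cheap accumulation in the interrupt path, with
the expensive mix deferred to a worker every 64 interrupts) can be sketched
roughly as below. This is a userspace illustration only, not the code in
random.c: struct fast_pool_sketch, on_interrupt(), deferred_mix(), and the
toy multiply-xor accumulator are all hypothetical stand-ins. The real code
is per-CPU and schedules an actual worker rather than setting a flag.

```c
#include <stdint.h>

#define MIX_THRESHOLD 64   /* "every 64 interrupts", per the message */

/* Hypothetical stand-in for a per-CPU accumulation buffer. */
struct fast_pool_sketch {
	uint64_t acc;          /* cheap accumulator (toy, not cryptographic) */
	unsigned int count;    /* events seen since the last deferred mix */
	int work_pending;      /* set when the deferred mix should run */
};

static uint64_t input_pool;      /* toy stand-in for the real input pool */
static unsigned int mixes_done;  /* how many deferred mixes have run */

/* "Hard IRQ" path: only cheap integer work, no hashing and no FPU. */
static void on_interrupt(struct fast_pool_sketch *p, uint64_t sample)
{
	p->acc = (p->acc ^ sample) * 0x9E3779B97F4A7C15ULL;
	if (++p->count >= MIX_THRESHOLD && !p->work_pending)
		p->work_pending = 1;  /* a kernel would queue a worker here */
}

/* Deferred path: this is where the expensive mix would happen. */
static void deferred_mix(struct fast_pool_sketch *p)
{
	if (!p->work_pending)
		return;
	input_pool ^= p->acc;  /* placeholder for the real pool mixing */
	mixes_done++;
	p->count = 0;
	p->work_pending = 0;
}
```

With this split, the interrupt path never calls the hash itself, which is
why a hash call showing up in hard IRQ context would be unexpected.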
So the question is: what is still hitting mix_pool_bytes() from hard IRQ
context? I'll investigate a bit and see.

Jason