lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <618074b8-c685-40bb-b7d3-f309b30cd25a@sirena.org.uk>
Date: Wed, 30 Oct 2024 18:10:28 +0000
From: Mark Brown <broonie@...nel.org>
To: Mark Rutland <mark.rutland@....com>
Cc: Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will@...nel.org>, linux-arm-kernel@...ts.infradead.org,
	linux-kernel@...r.kernel.org, stable@...r.kernel.org,
	Andre.Przywara@....com
Subject: Re: [PATCH] arm64/signal: Avoid corruption of SME state when
 entering signal handler

On Wed, Oct 30, 2024 at 05:34:16PM +0000, Mark Rutland wrote:

> I originally just had a few comments on the commit message, but I
> believe I've found a logic issue in this patch, and more general issue
> throughout our FPSIMD/SVE/SME manipulation -- more details below.

I'm fairly sure there's at least one other issue lurking somewhere with
TIF_SVE tearing, yes.  I've not been able to get that to reproduce, and
I've probably stared at this code too much to see it by pure inspection
however it looks like you might've spotted the issue here.

> On Wed, Oct 23, 2024 at 10:31:24PM +0100, Mark Brown wrote:

> It would be nice to have the signature of the failure as well, e.g.

> | This is intermittently detected by the fp-stress test, which
> | intermittently reports "ZA-VL-*-*: Bad SVCR: 0".

That's a common one for timing reasons, but it does also manifest with
other outputs (eg, if we turn off ZA while trying to execute
instructions that access ZA).

> I don't think this is correct in the TIF_FOREIGN_FPSTATE case. We don't
> unbind the saved state from another CPU it might still be resident on,
> and so IIUC there's a race whereby the updates to the saved state can
> end up discarded:

...

> ... and either:

> * A subsequent return to userspace will see TIF_FOREIGN_FPSTATE is
>   clear and not restore the in-memory state.

> * A subsequent context-switch will see TIF_FOREIGN_FPSTATE is clear an 
>   save the (stale) HW state again.

> It looks like we have a similar pattern all over the place, e.g.  in
> do_sve_acc():

Yes, indeed - I think that's a separate bug caused by the recalcuation
of TIF_FOREIGN_FPSTATE.

> This is going to need a careful audit and a proper series of
> fixes that can be backported to stable.

It feels like a separate thing at any rate.  We can do a simple and
robust but performance impacting fix by having fpsimd_thread_switch()
only ever set TIF_FOREIGN_FPSTATE, never clear it.  That'd cause extra
reloads in the case where we switch to a thread but stay in kernel mode
which probably happens often enough to be palatable.

Otherwise I'm not sure it's *too* hard, TIF_FOREIGN_FPSTATE is a bit of
a giveaway for places that could have issues.

Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ