Message-ID: <ZtsYNOTA6ekEa6TE@arm.com>
Date: Fri, 6 Sep 2024 15:56:52 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: Mark Brown <broonie@...nel.org>
Cc: Will Deacon <will@...nel.org>, linux-arm-kernel@...ts.infradead.org,
	linux-kernel@...r.kernel.org, Mark Rutland <mark.rutland@....com>
Subject: Re: [PATCH] arm64/fpsimd: Ensure we don't contend a SMCU from idling
 CPUs

On Thu, Sep 05, 2024 at 07:34:41PM +0100, Mark Brown wrote:
> On Thu, Sep 05, 2024 at 06:51:30PM +0100, Catalin Marinas wrote:
> 
> > OK, so likely the state is already saved, all we need to do here is
> > flush the state and SMSTOP. But why would switching to idle be any
> > different than switching to a thread that doesn't use SME? It feels
> > like we are just trying to optimise a special case only. Could we not
> > instead issue an SMSTOP in the context switch code?
> 
> On context switch the SMSTOP is issued as part of loading the state for
> the task but we only do that when either returning to userspace or it's
> a kernel thread with active FPSIMD usage.  The idle thread is a kernel
> thread with no FPSIMD usage so we don't touch the state.  If we did the
> SMSTOP unconditionally that'd mean that the optimisation where we don't
> reload the FP state if we bounce through a kernel thread would be broken
> while using SME which doesn't seem ideal, idling really does seem like a
> meaningfully special case here.

It depends on why the CPU is idling, and we don't have the full
information in this function. If it was a wait on a syscall, we already
discarded the state (but we only issue sme_smstop_sm() IIUC). With this
patch, we'd disable the ZA storage as well. Could that cause performance
issues by forcing the user to re-fault?

If it's some short-lived wait for I/O on page faults, we may not want to
disable streaming mode. I don't see this last case as much different
from switching to a kernel thread that doesn't use SME.

So I think this leaves us with the case where a thread is migrated to a
different CPU and the current CPU goes into idle for longer. But, again,
we can't tell in the arch callback. The cpuidle driver calling into
firmware is slightly better informed since it knows it's been idle (or
is going to be) for longer.
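
To make the three cases above concrete, here is a toy model in plain C.
It is only a sketch of the reasoning, not kernel code: the enum, the
struct and sme_idle_policy() are all hypothetical names, and the arch
idle callback does not actually receive an "idle reason" like this,
which is exactly the problem being described.

```c
#include <stdbool.h>

/*
 * Hypothetical reasons a CPU might enter idle. The real arch idle
 * callback has no such argument; this exists only for illustration.
 */
enum idle_reason {
    IDLE_SYSCALL_WAIT,   /* task blocked in a syscall */
    IDLE_SHORT_IO_WAIT,  /* brief wait, e.g. I/O on a page fault */
    IDLE_TASK_MIGRATED,  /* task moved to another CPU, longer idle */
};

/* What we would ideally do to the SME state before idling. */
struct sme_action {
    bool stop_streaming_mode;  /* leave streaming mode (PSTATE.SM = 0) */
    bool disable_za;           /* disable ZA storage (PSTATE.ZA = 0) */
};

/* Assumed policy, following the three cases discussed above. */
static struct sme_action sme_idle_policy(enum idle_reason why)
{
    struct sme_action act = { false, false };

    switch (why) {
    case IDLE_SYSCALL_WAIT:
        /*
         * Streaming mode state was already discarded on syscall
         * entry; leave ZA alone so the user need not re-fault.
         */
        act.stop_streaming_mode = true;
        break;
    case IDLE_SHORT_IO_WAIT:
        /* Same as bouncing through a kernel thread: touch nothing. */
        break;
    case IDLE_TASK_MIGRATED:
        /* Nothing will use this CPU's SME state soon: release it. */
        act.stop_streaming_mode = true;
        act.disable_za = true;
        break;
    }
    return act;
}
```

Only the IDLE_TASK_MIGRATED case in this sketch clearly benefits from
stopping both SM and ZA, and it is the one case the arch callback
cannot identify.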

> > Also this looks hypothetical until we have some hardware to test it on,
> > see how it would behave with a shared SME unit.
> 
> The specific performance impacts will depend on hardware (there'll
> likely be some power impact even on things with a single FP unit per
> PE) but given that keeping SM and ZA disabled when not in use is a
> fairly strong recommendation in the programming model my inclination at
> this point would be to program to the advertised model until we have
> confirmation that the hardware actually behaves otherwise.

Does the programming model talk about shared units (I haven't read it,
not even sure where it is)? I hope one CPU cannot DoS another by not
issuing SMSTOPs and the hardware has some provisions for sharing that
guarantee forward progress on all CPUs. Those provisions may not be
optimal, but that is highly dependent on the software usage and
hardware behaviour.

I'm inclined not to do anything at this stage until we see the actual
hardware behaviour in practice.

-- 
Catalin
