lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250509142852.GA5845@willie-the-truck>
Date: Fri, 9 May 2025 15:28:53 +0100
From: Will Deacon <will@...nel.org>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Mikołaj Lenczewski <miko.lenczewski@....com>,
	suzuki.poulose@....com, yang@...amperecomputing.com, corbet@....net,
	catalin.marinas@....com, jean-philippe@...aro.org,
	robin.murphy@....com, joro@...tes.org, akpm@...ux-foundation.org,
	paulmck@...nel.org, mark.rutland@....com, joey.gouly@....com,
	maz@...nel.org, james.morse@....com, broonie@...nel.org,
	oliver.upton@...ux.dev, baohua@...nel.org, david@...hat.com,
	ioworker0@...il.com, jgg@...pe.ca, nicolinc@...dia.com,
	mshavit@...gle.com, jsnitsel@...hat.com, smostafa@...gle.com,
	kevin.tian@...el.com, linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
	iommu@...ts.linux.dev
Subject: Re: [RESEND PATCH v6 1/3] arm64: Add BBM Level 2 cpu feature

On Fri, May 09, 2025 at 03:16:01PM +0100, Ryan Roberts wrote:
> On 09/05/2025 14:49, Will Deacon wrote:
> > On Tue, May 06, 2025 at 03:52:59PM +0100, Ryan Roberts wrote:
> >> On 06/05/2025 15:25, Will Deacon wrote:
> >>> On Mon, Apr 28, 2025 at 03:35:14PM +0000, Mikołaj Lenczewski wrote:
> >>>> +static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
> >>>> +{
> >>>> +	if (!IS_ENABLED(CONFIG_ARM64_BBML2_NOABORT))
> >>>> +		return false;
> >>>> +
> >>>> +	if (scope & SCOPE_SYSTEM) {
> >>>> +		int cpu;
> >>>> +
> >>>> +		/*
> >>>> +		 * We are a boot CPU, and must verify that all enumerated boot
> >>>> +		 * CPUs have MIDR values within our allowlist. Otherwise, we do
> >>>> +		 * not allow the BBML2 feature to avoid potential faults when
> >>>> +		 * the insufficient CPUs access memory regions using BBML2
> >>>> +		 * semantics.
> >>>> +		 */
> >>>> +		for_each_online_cpu(cpu) {
> >>>> +			if (!cpu_has_bbml2_noabort(cpu_read_midr(cpu)))
> >>>> +				return false;
> >>>> +		}
> >>>
> >>> This penalises large homogeneous systems and it feels unnecessary given
> >>> that we have the ability to check this per-CPU. 
> 
> In case you didn't spot it, cpu_read_midr() is not actually reading a remote
> cpu's midr. It's reading the cached midr in a per-cpu data structure. So I don't
> think this will be very expensive in reality. And it's only run once during boot...
> 
> static inline unsigned int cpu_read_midr(int cpu)
> {
> 	WARN_ON_ONCE(!cpu_online(cpu));
> 
> 	return per_cpu(cpu_data, cpu).reg_midr;
> }

I know it's not reading a remote MIDR, that would be crazy :)

But iterating over per-cpu variables sucks and we should be able to avoid
it because this code already knows how to detect features locally.

> 
> > Can you use
> >>> ARM64_CPUCAP_BOOT_CPU_FEATURE instead of ARM64_CPUCAP_SYSTEM_FEATURE
> >>> to solve this?
> >>
> >> We are trying to solve for the case where the boot CPU has BBML2 but a secondary
> >> CPU doesn't. (e.g. hetrogeneous system where boot CPU is big and secondary is
> >> little and does not advertise the feature. I can't remember if we proved there
> >> are real systems with this config - I have vague recollection that we did but my
> >> memory is poor...).
> >>
> >> My understanding is that for ARM64_CPUCAP_BOOT_CPU_FEATURE, "If the boot CPU
> >> has enabled this feature already, then every late CPU must have it". So that
> >> would exclude any secondary CPUs without BBML2 from coming online?
> > 
> > Damn, yes, you're right. However, it still feels horribly hacky to iterate
> > over the online CPUs in has_bbml2_noabort() -- the cpufeature framework
> > has the ability to query features locally and we should be able to use
> > that. We're going to want that should the architecture eventually decide
> > on something like BBML3 for this.
> 
> For BBML3, we're looking for a minimum value in the BBM field of the FFMR, and
> in that case the framework can handle it nicely with
> ARM64_CPUCAP_SYSTEM_FEATURE. But the framework doesn't have any support for the
> case where we need to look at all the midrs. So Suzuki came up with this method.
> 
> If/when we have BBML3 I was thinking we could retrospectively treat the CPUs in
> the midr list as having an erratum that they don't advertise BBML3 when they
> should (since the semantics are basically the same I expect/hope/have been
> trying to ensure), so we can just implement workarounds to make it look like
> they do have BBML3, then the framework does it's thing. Perhaps we can live with
> this hack until we get to that point?

I think if you want to go down that route, then this needs to be detected
locally on each CPU.

> > What we have with BBML2_NOABORT seems similar to an hwcap in that we only
> > support the capability if all CPUs have it (rejecting late CPUs without it
> > in that case) but we can live without it if not all of the early CPUs
> > have it. Unlikely hwcaps, though, we shouldn't be advertising this to
> > userspace and we can't derive the capability solely from the sanitised
> > system registers.
> 
> Agreed.
> 
> > 
> > I wonder if we could treat it like an erratum in some way instead? That
> > is, invert things so that CPUs which _don't_ have BBML2_NOABORT are
> > considered to have a "BBM_CONFLICT_ABORT" erratum (which we obviously
> > wouldn't shout about). Then we should be able to say:
> > 
> >   - If any of the early CPUs don't have BBML2_NOABORT, then the erratum
> >     would be enabled and we wouln't elide BBM.
> > 
> >   - If a late CPU doesn't have BBML2_NOABORT then it can't come online
> >     if the erratum isn't already enabled.
> 
> That's exactly the policy that this cludge provides. But it's using the midr to
> check if the CPU has BBML2_NOABORT. I'm not sure I follow your point about a
> "BBM_CONFLICT_ABORT" erratum?

I was hoping that it would mean that each CPU can independently determine
whether or not they have the erratum and then enable it as soon as they
detect it. That way, there's no need to iterate over all the early cores.

> I'm also at a massive disadvantage because I find the whole cpufeatures
> framework impenetrable :)
> 
> I'll ping Suzuki to see if he can chime in here.

Thanks,

Will

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ