Message-ID: <Z8WKQJcPTQDIXaKD@hovoldconsulting.com>
Date: Mon, 3 Mar 2025 11:53:52 +0100
From: Johan Hovold <johan@...nel.org>
To: Cristian Marussi <cristian.marussi@....com>
Cc: Dan Carpenter <dan.carpenter@...aro.org>,
	Sibi Sankar <quic_sibis@...cinc.com>, sudeep.holla@....com,
	dmitry.baryshkov@...aro.org, maz@...nel.org,
	linux-kernel@...r.kernel.org, arm-scmi@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org, linux-arm-msm@...r.kernel.org,
	konradybcio@...nel.org
Subject: Re: [RFC V6 2/2] firmware: arm_scmi: Add quirk to bypass SCP fw bug

Hi Cristian,

On Thu, Feb 27, 2025 at 08:34:44AM +0000, Cristian Marussi wrote:
> On Wed, Feb 26, 2025 at 10:58:44AM +0100, Johan Hovold wrote:

> > Something like that, yes. :) I didn't try to implement it, but it seems
> > like it should be possible to implement this in a way that keeps the
> > quirk handling isolated.
> 
> I hope next week to have a better look at this, in the meantime just a
> few considerations....
> 
> Sooner or later we should have introduced some sort of quirk framework
> in SCMI to deal systematically with potentially out-of-spec FW, but as
> in the name, it should be some sort of framework where you have a table of
> quirks, related activation conditions and a few very well isolated points
> where the quirks are placed and take action if enabled...this does not
> seem the case here where instead an ad-hoc param is added to the function
> that needs to be quirked...this does not scale and will make the codebase
> a mess IMHO...

Sounds good. At least we have a good understanding now of how this
particular firmware is broken so it would be great if you could use
this as a test case for the implementation.

In summary, we need to force the use of a fast channel for
PERF_LEVEL_GET on these machines, or possibly fall back to the current
behaviour of only using the domain attribute to determine whether the
fast channels should be initialised.

The latter may allow for a less intrusive implementation even if we'd
still see:

	arm-scmi arm-scmi.0.auto: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:0] - ret:-95. Using regular messaging.
	arm-scmi arm-scmi.0.auto: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:1] - ret:-95. Using regular messaging.
	arm-scmi arm-scmi.0.auto: Failed to get FC for protocol 13 [MSG_ID:6 / RES_ID:2] - ret:-95. Using regular messaging.

when not supported for all messages (e.g. with the current firmware).
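
For illustration only, the second option could be sketched roughly as
below. This is a minimal sketch and not the driver's code: fc_should_init(),
fc_in_use() and the quirk flag are hypothetical names, and the three booleans
stand in for the domain attribute, the per-message attribute and an active
quirk. It also shows the fallback seen in the log lines above, where a failed
fast channel lookup (-95 is -EOPNOTSUPP on Linux) results in regular
messaging.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether to attempt fast channel (FC)
 * initialisation for one message.
 *
 * domain_supports_fc    - the domain attribute advertises fast channels
 * msg_reports_fc        - the per-message attribute advertises FC support
 * quirk_ignore_msg_attr - assumed quirk flag: trust only the domain
 *                         attribute (the "current behaviour" fallback)
 */
static bool fc_should_init(bool domain_supports_fc, bool msg_reports_fc,
			   bool quirk_ignore_msg_attr)
{
	if (!domain_supports_fc)
		return false;

	return msg_reports_fc || quirk_ignore_msg_attr;
}

/*
 * Hypothetical helper: after attempting the FC lookup, any error
 * (e.g. -EOPNOTSUPP) means regular messaging is used instead.
 */
static bool fc_in_use(int describe_ret)
{
	return describe_ret == 0;
}
```

With the quirk flag set, a lookup is attempted for every message of an
FC-capable domain, so messages the firmware does not support would still
produce the "Using regular messaging" lines.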

> Last but not least, these quirks 'activations' in the SCMI world should
> be driven mainly by the VENDOR/SUB-VENDOR/IMPLEMENTATION_VERS triple and
> only as a last resort by a platform compatible match...because the
> IMPLEMENTATION_VERSION, exposed by the FW and gathered via SCMI Base
> protocol, is completely under the vendor control so it can, and should, be
> used to identify broken FW-versions...indeed all the distinct SCMI protocols
> are anyway versioned elsewhere according to the spec, so there is no need to
> repeat SCMI protocol version here..I mean it is an IMPLEMENTATION version
> 
> As an example on a JUNO the SCP reference FW reports:
> 
> arm-scmi arm-scmi.1.auto: SCMI Protocol v2.0 'arm:arm' Firmware version 0x20f0000
> 
> where the FW version represents something like the FW release tag, I think...not
> really sure about the meaning our FW guys give to it, BUT definitely it is not
> a bare copy of the SCMI protocol version...
> 
> So...
> ...does the platform-to-be-quirked at hand properly use the IMPLEMENTATION_VERSION
> flag in a sensible way ?
> IOW does that change between a bad and good (or less bad :D) version ?

I guess only Sibi and Qualcomm can answer that. Both machines I have
that suffer from this report:

	arm-scmi arm-scmi.0.auto: SCMI Protocol v2.0 'Qualcomm:' Firmware version 0x20000

and I'm not sure if any fixed firmware has made it out to the vendors
yet or if the version was bumped when this was fixed.

On the other hand, perhaps forcing fast channel initialisation for
PERF_LEVEL_GET on all 'Qualcomm' firmware would work (i.e. only based on
VENDOR).
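
As a rough illustration of the table-driven matching discussed above, a
quirk entry keyed on VENDOR / SUB-VENDOR / IMPLEMENTATION_VERSION could look
something like the sketch below. All names here (scmi_quirk_entry,
quirk_enabled(), QUIRK_PERF_LEVEL_GET_FORCE_FC, example_quirks) are
hypothetical and not taken from the kernel, and the single table entry is an
assumption: it matches on the vendor string alone with an open version
range, since it is not known which firmware versions are fixed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical quirk identifiers, for illustration only. */
enum scmi_quirk_id {
	QUIRK_NONE = 0,
	QUIRK_PERF_LEVEL_GET_FORCE_FC,
};

/* One table row: which quirk, and under which conditions it activates. */
struct scmi_quirk_entry {
	enum scmi_quirk_id id;
	const char *vendor;	/* NULL matches any vendor */
	const char *sub_vendor;	/* NULL matches any sub-vendor */
	uint32_t impl_ver_min;	/* inclusive */
	uint32_t impl_ver_max;	/* inclusive */
};

/* Assumed example: vendor-only match, any implementation version. */
static const struct scmi_quirk_entry example_quirks[] = {
	{ QUIRK_PERF_LEVEL_GET_FORCE_FC, "Qualcomm", NULL, 0, UINT32_MAX },
};

#define EXAMPLE_QUIRKS_LEN (sizeof(example_quirks) / sizeof(example_quirks[0]))

/* A NULL pattern is a wildcard; otherwise require an exact string match. */
static bool str_matches(const char *pattern, const char *value)
{
	return !pattern || (value && strcmp(pattern, value) == 0);
}

/* Return true if quirk 'id' is active for the reported firmware identity. */
static bool quirk_enabled(const struct scmi_quirk_entry *table, size_t n,
			  enum scmi_quirk_id id, const char *vendor,
			  const char *sub_vendor, uint32_t impl_ver)
{
	assert(table || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct scmi_quirk_entry *q = &table[i];

		if (q->id == id &&
		    str_matches(q->vendor, vendor) &&
		    str_matches(q->sub_vendor, sub_vendor) &&
		    impl_ver >= q->impl_ver_min &&
		    impl_ver <= q->impl_ver_max)
			return true;
	}

	return false;
}
```

With this table, a firmware reporting vendor "Qualcomm" and version 0x20000
would activate the quirk, while one reporting 'arm:arm' and 0x20f0000 would
not. Narrowing impl_ver_min/impl_ver_max would only make sense once it is
known whether the version changes between broken and fixed firmware.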

> Because shooting with the platform 'compatible-gun' should be the last resort
> if all of the above does NOT happen in this beloved fw...
> 
> Anyway, after all of this babbling, I know, talk is cheap :D...so now I will shut
> up and see if I can prototype something generic to deal with quirks, possibly
> next week...

Much appreciated, thanks.

Johan

