Message-ID: <b51e1230-d366-4d0f-adc8-fac01b5de655@oss.qualcomm.com>
Date: Fri, 12 Sep 2025 15:10:16 +0200
From: Konrad Dybcio <konrad.dybcio@....qualcomm.com>
To: Brian Norris <briannorris@...omium.org>
Cc: Bjorn Andersson <andersson@...nel.org>,
        Konrad Dybcio <konradybcio@...nel.org>,
        Georgi Djakov <djakov@...nel.org>,
        Odelu Kukatla <quic_okukatla@...cinc.com>,
        cros-qcom-dts-watchers@...omium.org,
        Conor Dooley <conor+dt@...nel.org>, linux-kernel@...r.kernel.org,
        linux-arm-msm@...r.kernel.org,
        Krzysztof Kozlowski <krzk+dt@...nel.org>,
        Rob Herring <robh@...nel.org>,
        Douglas Anderson <dianders@...omium.org>, devicetree@...r.kernel.org
Subject: Re: [PATCH v2 2/2] arm64: dts: qcom: sc7280: Drop aggre{1,2}_noc QOS
 clocks on Herobrine

On 9/11/25 9:00 PM, Brian Norris wrote:
> Hi Konrad,
> 
> On Tue, Sep 02, 2025 at 02:02:15PM +0200, Konrad Dybcio wrote:
>> On 8/26/25 12:55 AM, Brian Norris wrote:
>>> Ever since these two commits
>>>
>>>   fbd908bb8bc0 ("interconnect: qcom: sc7280: enable QoS configuration")
>>>   2b5004956aff ("arm64: dts: qcom: sc7280: Add clocks for QOS configuration")
>>>
>>> Herobrine systems fail to boot due to crashes like the following:
>>>
>>> [    0.243171] Kernel panic - not syncing: Asynchronous SError Interrupt
>>> [    0.243173] CPU: 7 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0 #1 c5464041cff584ced692726af2c4400fa2bde1db
>>> [    0.243178] Hardware name: Qualcomm Technologies, Inc. sc7280 CRD platform (rev5+) (DT)
>>> [    0.243180] Call trace:
>>> [    0.243182]  dump_backtrace+0x104/0x128
>>> [    0.243194]  show_stack+0x24/0x38
>>> [    0.243202]  __dump_stack+0x28/0x38
>>> [    0.243208]  dump_stack_lvl+0x28/0xb8
>>> [    0.243211]  dump_stack+0x18/0x30
>>> [    0.243215]  panic+0x134/0x340
>>> [    0.243219]  nmi_panic+0x48/0x98
>>> [    0.243227]  arm64_serror_panic+0x6c/0x80
>>> [    0.243245]  arm64_is_fatal_ras_serror+0xd8/0xe0
>>> [    0.243261]  do_serror+0x5c/0xa8
>>> [    0.243265]  el1h_64_error_handler+0x34/0x48
>>> [    0.243272]  el1h_64_error+0x7c/0x80
>>> [    0.243285]  regmap_mmio_read+0x5c/0xc0
>>> [    0.243289]  _regmap_bus_reg_read+0x78/0xf8
>>> [    0.243296]  regmap_update_bits_base+0xec/0x3a8
>>> [    0.243300]  qcom_icc_rpmh_probe+0x2d4/0x490
>>> [    0.243308]  platform_probe+0xb4/0xe0
>>> [...]
>>>
>>> Specifically, they fail in qcom_icc_set_qos() when trying to write the
>>> QoS settings for qhm_qup1. Several of the previous nodes (qhm_qspi,
>>> qhm_qup0, ...) seem to configure without crashing.
>>>
>>> We suspect that the TZ firmware on these devices does not expose QoS
>>> regions to Linux. The right solution here might involve deleting both
>>> 'clocks' and 'reg', but 'reg' would cause more problems. Linux is
>>> already OK with a missing 'clocks', since pre-2b5004956aff DTBs need to
>>> be supported, so we go with an easier solution.
>>
>> Just to make sure I'm reading this right - the clocks enable just fine,
>> but it's the writes to the QoS settings that trigger the hang?
> 
> Yes.
> 
>> Any chance skipping qhm_qup1 specifically makes things better?
> 
> Yes, it seems so. Or specifically, this diff:
> 
> --- a/drivers/interconnect/qcom/sc7280.c
> +++ b/drivers/interconnect/qcom/sc7280.c
> @@ -52,12 +52,6 @@ static struct qcom_icc_node qhm_qup1 = {
>  	.id = SC7280_MASTER_QUP_1,
>  	.channels = 1,
>  	.buswidth = 4,
> -	.qosbox = &(const struct qcom_icc_qosbox) {
> -		.num_ports = 1,
> -		.port_offsets = { 0x8000 },
> -		.prio = 2,
> -		.urg_fwd = 0,
> -	},
>  	.num_links = 1,
>  	.links = { SC7280_SLAVE_A1NOC_SNOC },
>  };

While I try to find a board that boots with your software stack,
could I ask you to check whether commenting out any one of the three writes in

drivers/interconnect/qcom/icc-rpmh.c : qcom_icc_set_qos()

specifically causes the crash?

FWIW, they're supposed to be independent, so you don't have to test
all possible combinations.
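
To make the experiment concrete, here is a minimal userspace sketch of the
idea, not the driver code itself. It assumes the three writes are independent
read-modify-write field updates; the SKIP_WRITE_* names, the FIELD*_MASK bit
positions, and the fake_reg stand-in are all hypothetical and chosen only for
illustration. The value 2 for the second field mirrors .prio = 2 from the diff
above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the three writes; these names are not the driver's. */
#define SKIP_WRITE_0	(1u << 0)
#define SKIP_WRITE_1	(1u << 1)
#define SKIP_WRITE_2	(1u << 2)

/* Placeholder field layout, NOT taken from the real register map. */
#define FIELD0_MASK	(1u << 24)
#define FIELD1_MASK	(7u << 4)
#define FIELD2_MASK	(1u << 3)

/* Userspace stand-in for one memory-mapped register. */
struct fake_reg {
	uint32_t val;
	unsigned int accesses;	/* read-modify-write cycles issued so far */
};

/* Stand-in for a read-modify-write update of one field. */
static void fake_update_bits(struct fake_reg *reg, uint32_t mask, uint32_t val)
{
	assert((val & ~mask) == 0);	/* value must fit inside the field */
	reg->val = (reg->val & ~mask) | val;
	reg->accesses++;
}

/*
 * Issue up to three independent field updates, skipping the ones selected
 * in skip_mask. Returns a bitmask of the writes that were actually issued,
 * so each test run can be logged next to its boot result.
 */
unsigned int set_qos_with_skips(struct fake_reg *reg, unsigned int skip_mask)
{
	unsigned int issued = 0;

	if (!(skip_mask & SKIP_WRITE_0)) {
		fake_update_bits(reg, FIELD0_MASK, 0);
		issued |= SKIP_WRITE_0;
	}
	if (!(skip_mask & SKIP_WRITE_1)) {
		fake_update_bits(reg, FIELD1_MASK, 2u << 4);	/* prio = 2 */
		issued |= SKIP_WRITE_1;
	}
	if (!(skip_mask & SKIP_WRITE_2)) {
		fake_update_bits(reg, FIELD2_MASK, 0);		/* urg_fwd = 0 */
		issued |= SKIP_WRITE_2;
	}
	return issued;
}
```

Three boots, each skipping exactly one write, are enough if the writes really
are independent. One caveat: the trace above faults in regmap_mmio_read()
under regmap_update_bits_base(), i.e. in the read half of the update, so it
is possible that whichever write comes first will fault regardless of which
one is skipped.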

Konrad

> 
>> Could you please share your exact software version (which I assume is really
>> just the version of TF-A in this case) so I can try and reproduce it?
> 
> I'm not much of an expert on the makeup of QCOM firmware, but reading my
> firmware logs, that'd be:
> 
>   coreboot-v1.9308_26_0.0.22-32067-g641732a20a
> 
> and
> 
>   BL31: v2.8(debug):v2.8-776-g0223d1576
> 
> IIUC, the latter points to TF-A hash:
> 
>   0223d15764ed Merge "feat(docs): allow verbose build" into integration
> 
> Brian
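
For reference, both version strings above look like `git describe` output,
where the abbreviated commit hash follows the final "-g". A minimal sketch of
pulling that hash out is below; describe_hash() is a hypothetical helper
written for this illustration, and it assumes the hash is the last component
of the string (it deliberately rejects suffixes such as "-dirty").

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the abbreviated hash inside a git-describe style
 * string (e.g. "tag-N-g<hash>"), or NULL if no trailing "-g<hex>" exists.
 * The returned pointer aliases the input string.
 */
const char *describe_hash(const char *s)
{
	const char *p = s;
	const char *last = NULL;

	/* Find the last occurrence of "-g". */
	while ((p = strstr(p, "-g")) != NULL) {
		last = p + 2;
		p += 2;
	}

	if (last == NULL || !isxdigit((unsigned char)last[0]))
		return NULL;

	/* Everything after "-g" must be hex digits up to the end. */
	for (p = last; *p != '\0'; p++) {
		if (!isxdigit((unsigned char)*p))
			return NULL;
	}
	return last;
}
```

Applied to the BL31 string quoted above, this yields the same abbreviated
hash that the TF-A commit line starts with.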
