Message-ID: <86o72z10b6.wl-maz@kernel.org>
Date: Fri, 01 Nov 2024 14:43:57 +0000
From: Marc Zyngier <maz@...nel.org>
To: Johan Hovold <johan@...nel.org>
Cc: Sibi Sankar <quic_sibis@...cinc.com>,
	sudeep.holla@....com,
	cristian.marussi@....com,
	andersson@...nel.org,
	konrad.dybcio@...aro.org,
	robh+dt@...nel.org,
	krzysztof.kozlowski+dt@...aro.org,
	dmitry.baryshkov@...aro.org,
	linux-kernel@...r.kernel.org,
	linux-arm-msm@...r.kernel.org,
	devicetree@...r.kernel.org,
	quic_rgottimu@...cinc.com,
	quic_kshivnan@...cinc.com,
	conor+dt@...nel.org,
	quic_nkela@...cinc.com,
	quic_psodagud@...cinc.com,
	abel.vesa@...aro.org
Subject: Re: [PATCH V7 0/2] qcom: x1e80100: Enable CPUFreq

On Fri, 01 Nov 2024 14:19:54 +0000,
Johan Hovold <johan@...nel.org> wrote:
> 
> On Fri, Nov 01, 2024 at 02:08:24PM +0000, Marc Zyngier wrote:
> 
> > I'm seeing similar things indeed. Randomly grepping in cpufreq/policy*
> > results in hard resets, although I don't get much on the serial
> > console when that happens. Interestingly, I also see some errors in
> > dmesg at boot time:
> > 
> > maz@...i-fraudulent:~$ dmesg| grep -i scmi
> > [    0.966175] scmi_core: SCMI protocol bus registered
> > [    7.929710] arm-scmi arm-scmi.2.auto: Using scmi_mailbox_transport
> > [    7.939059] arm-scmi arm-scmi.2.auto: SCMI max-rx-timeout: 30ms
> > [    7.945567] arm-scmi arm-scmi.2.auto: SCMI RAW Mode initialized for instance 0
> > [    7.958348] arm-scmi arm-scmi.2.auto: SCMI RAW Mode COEX enabled !
> > [    7.978303] arm-scmi arm-scmi.2.auto: SCMI Notifications - Core Enabled.
> > [    7.985351] arm-scmi arm-scmi.2.auto: SCMI Protocol v2.0 'Qualcomm:' Firmware version 0x20000
> > [    8.033774] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
> > [    8.033902] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
> > [    8.036528] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
> > [    8.036744] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
> > [    8.171232] scmi-perf-domain scmi_dev.4: Initialized 3 performance domains
> > 
> > All these "Failed" are a bit worrying. Happy to put any theory to the
> > test.
> 
> Yes, those warnings indeed look troubling. Fortunately they appear to be
> mostly benign and only indicate that the firmware is reporting duplicate
> OPPs, which the kernel is now ignoring without any other side effects
> than the warnings.

Right. Not something that would explain the hard reset behaviour then.

> 
> The side-effects and these remaining warnings are addressed by this
> series:
> 
> 	https://lore.kernel.org/all/20241030125512.2884761-1-quic_sibis@quicinc.com/
> 
> but I think we should try to make the warnings a bit more informative
> (and less scary) by printing something along the lines of:
> 
> 	arm-scmi arm-scmi.0.auto: [Firmware Bug]: Ignoring duplicate OPP 3417600 for NCC
> 
> instead.

Indeed. Seeing [Firmware Bug] has a comforting feeling of
familiarity... :)
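
As a rough illustration of the suggested wording, the following user-space sketch rewrites the existing "Failed to add opps_by_lvl" lines from the dmesg excerpt above into the proposed "[Firmware Bug]" form, collapsing repeats into a single line. It is not kernel code and not the patch under discussion: the function names, the regular expression and the "seen Nx" suffix are assumptions made for this example. It only relies on ret:-16 corresponding to -EBUSY on Linux.

```python
import errno
import re

# Matches the current warning format as it appears in the dmesg excerpt.
# The pattern itself is an assumption of this sketch, not a kernel interface.
FAILED_OPP = re.compile(
    r"(?P<dev>arm-scmi \S+): Failed to add opps_by_lvl at "
    r"(?P<level>\d+) for (?P<domain>\S+) - ret:(?P<ret>-?\d+)"
)

# Lines copied from the dmesg output quoted in this message.
SAMPLE = [
    "[    8.033774] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16",
    "[    8.033902] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16",
    "[    8.036528] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16",
    "[    8.036744] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16",
    "[    8.171232] scmi-perf-domain scmi_dev.4: Initialized 3 performance domains",
]


def summarize_duplicate_opps(lines):
    """Count identical 'Failed to add opps_by_lvl' warnings (hypothetical helper)."""
    counts = {}
    for line in lines:
        m = FAILED_OPP.search(line)
        if not m:
            continue  # unrelated log line
        key = (m["dev"], int(m["level"]), m["domain"], int(m["ret"]))
        counts[key] = counts.get(key, 0) + 1
    return counts


def format_summary(counts):
    """Render one line per distinct warning in the proposed '[Firmware Bug]' style."""
    out = []
    for (dev, level, domain, ret), seen in counts.items():
        # Translate the numeric return code into its errno name, if known.
        name = errno.errorcode.get(-ret, "unknown")
        out.append(
            f"{dev}: [Firmware Bug]: Ignoring duplicate OPP {level} "
            f"for {domain} ({name}, seen {seen}x)"
        )
    return out


if __name__ == "__main__":
    for line in format_summary(summarize_duplicate_opps(SAMPLE)):
        print(line)
```

Feeding it the output of `dmesg | grep -i scmi` instead of SAMPLE gives a quick view of how many distinct duplicate OPPs the firmware reports, as opposed to how many times the same warning is printed.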

I wonder whether the same sort of reset happens on more "commercial"
systems (such as some of the laptops). You'd expect that people look at
the cpufreq stuff closely, and don't see things exploding like we do.

	M.

-- 
Without deviation from the norm, progress is not possible.
