Message-ID: <86plnf11yf.wl-maz@kernel.org>
Date: Fri, 01 Nov 2024 14:08:24 +0000
From: Marc Zyngier <maz@...nel.org>
To: Johan Hovold <johan@...nel.org>
Cc: Sibi Sankar <quic_sibis@...cinc.com>,
	sudeep.holla@....com,
	cristian.marussi@....com,
	andersson@...nel.org,
	konrad.dybcio@...aro.org,
	robh+dt@...nel.org,
	krzysztof.kozlowski+dt@...aro.org,
	dmitry.baryshkov@...aro.org,
	linux-kernel@...r.kernel.org,
	linux-arm-msm@...r.kernel.org,
	devicetree@...r.kernel.org,
	quic_rgottimu@...cinc.com,
	quic_kshivnan@...cinc.com,
	conor+dt@...nel.org,
	quic_nkela@...cinc.com,
	quic_psodagud@...cinc.com,
	abel.vesa@...aro.org
Subject: Re: [PATCH V7 0/2] qcom: x1e80100: Enable CPUFreq

On Fri, 01 Nov 2024 13:00:37 +0000,
Johan Hovold <johan@...nel.org> wrote:
> 
> [ +CC: Marc, who I think I saw reporting something similar even if I can't
>   seem to find where right now ]

It was on IRC.

> 
> On Wed, Oct 30, 2024 at 06:38:38PM +0530, Sibi Sankar wrote:
> > This series enables CPUFreq support on the X1E SoC using the SCMI perf
> > protocol. This was originally part of the RFC: firmware: arm_scmi:
> > Qualcomm Vendor Protocol [1]. I've split it up so that this part can
> > land earlier. Warnings introduced by the series are fixed by [2].
> 
> > Sibi Sankar (2):
> >   arm64: dts: qcom: x1e80100: Add cpucp mailbox and sram nodes
> >   arm64: dts: qcom: x1e80100: Enable cpufreq
> 
> I've been running with v6 of these for a while now, without noticing any
> issues, and just updated to v7 to be able to provide a Tested-by tag.
> 
> I wanted to run a compilation and see how the frequencies varied, but
> before I got around to that I just grepped the cpufreq sysfs attributes
> for CPU0 four times. And this triggered a reset of the machine (x1e80100
> CRD).
> 
> The last values output were:
> 
> 	affected_cpus:0 1 2 3
> 	cpuinfo_cur_freq:<unknown>
> 	cpuinfo_max_freq:3417600
> 	cpuinfo_min_freq:710400
> 	cpuinfo_transition_latency:30000
> 	related_cpus:0 1 2 3
> 	scaling_available_frequencies:710400 806400 998400 1190400 1440000 1670400 1920000 2188800 2515200 2707200 2976000 320
> 	scaling_available_governors:ondemand userspace performance schedutil
> 	scaling_cur_freq:806400
> 	scaling_driver:scmi
> 	scaling_governor:schedutil
> 	scaling_max_freq:3417600
> 	scaling_min_freq:710400
> 	scaling_setspeed:<unsupported>
> 
> Notice the <unknown> current frequency (the previous greps said 710400
> and 2515200).
> 
> The last thing I see on the serial console, presumably just before
> the reset, is:
> 
> 	[  196.268025] arm-scmi arm-scmi.0.auto: timed out in resp(caller: do_xfer+0x164/0x564)
> 
> I just rebooted and grepped again and it triggered on the first attempt
> (cur_freq also said '<unknown>'). Same error in the log, printed when
> grepping.
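
The '<unknown>' in cpuinfo_cur_freq is the odd one out in that dump. As far as I know, scaling_setspeed reads '<unsupported>' whenever the governor is not userspace, so that one is expected. A rough sketch for spotting such placeholder values in a captured dump, assuming the name:value layout that grep prints above (the helper name is hypothetical, not an existing tool):

```python
def find_placeholder_attrs(dump):
    """Return {attribute: value} for every value of the form <...>."""
    flagged = {}
    for raw in dump.splitlines():
        # Drop mail quote markers and indentation, if present.
        line = raw.lstrip("> \t")
        name, sep, value = line.partition(":")
        value = value.strip()
        # Placeholder values are wrapped in angle brackets, e.g. <unknown>.
        if sep and value.startswith("<") and value.endswith(">"):
            flagged[name] = value
    return flagged
```

This only parses text that was already captured; it does not touch sysfs, so it is safe to run on an affected machine.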

I'm seeing similar things indeed. Randomly grepping in cpufreq/policy*
results in hard resets, although I don't get much on the serial
console when that happens. Interestingly, I also see some errors in
dmesg at boot time:

maz@...i-fraudulent:~$ dmesg| grep -i scmi
[    0.966175] scmi_core: SCMI protocol bus registered
[    7.929710] arm-scmi arm-scmi.2.auto: Using scmi_mailbox_transport
[    7.939059] arm-scmi arm-scmi.2.auto: SCMI max-rx-timeout: 30ms
[    7.945567] arm-scmi arm-scmi.2.auto: SCMI RAW Mode initialized for instance 0
[    7.958348] arm-scmi arm-scmi.2.auto: SCMI RAW Mode COEX enabled !
[    7.978303] arm-scmi arm-scmi.2.auto: SCMI Notifications - Core Enabled.
[    7.985351] arm-scmi arm-scmi.2.auto: SCMI Protocol v2.0 'Qualcomm:' Firmware version 0x20000
[    8.033774] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
[    8.033902] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
[    8.036528] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
[    8.036744] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
[    8.171232] scmi-perf-domain scmi_dev.4: Initialized 3 performance domains

All these "Failed" are a bit worrying. Happy to put any theory to the
test.
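
In case it helps with triage, here is a minimal sketch for tallying those failures from a saved dmesg capture. It assumes the message layout seen in the log above, which may differ between kernel versions; the helper name and the regular expression are made up for illustration and are not part of the SCMI driver. The return code is decoded with the local errno table (on Linux, 16 is EBUSY).

```python
import errno
import re
from collections import Counter

# Pattern for the failure lines in the log above (assumed layout).
OPP_FAIL_RE = re.compile(
    r"Failed to add opps_by_lvl at (?P<level>\d+) "
    r"for (?P<domain>\S+) - ret:(?P<ret>-?\d+)"
)

def summarize_opp_failures(log_text):
    """Count failures per (domain, level, errno name)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = OPP_FAIL_RE.search(line)
        if m is None:
            continue  # not an opps_by_lvl failure line
        ret = int(m.group("ret"))
        # Kernel return codes are negative errno values.
        name = errno.errorcode.get(abs(ret), str(ret))
        counts[(m.group("domain"), int(m.group("level")), name)] += 1
    return counts

SAMPLE = """\
[    7.985351] arm-scmi arm-scmi.2.auto: SCMI Protocol v2.0 'Qualcomm:' Firmware version 0x20000
[    8.033774] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
[    8.033902] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
[    8.036528] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
[    8.036744] arm-scmi arm-scmi.2.auto: Failed to add opps_by_lvl at 3801600 for NCC - ret:-16
[    8.171232] scmi-perf-domain scmi_dev.4: Initialized 3 performance domains
"""

if __name__ == "__main__":
    for key, count in summarize_opp_failures(SAMPLE).items():
        print(key, count)
```

Fed the log above, this groups all four messages under a single (NCC, 3801600, EBUSY) entry.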

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
