lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201029100335.27665-1-peter.ujfalusi@ti.com>
Date:   Thu, 29 Oct 2020 12:03:35 +0200
From:   Peter Ujfalusi <peter.ujfalusi@...com>
To:     <edubezval@...il.com>, <j-keerthy@...com>, <aford173@...il.com>
CC:     <linux-pm@...r.kernel.org>, <linux-omap@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>, <andreas@...nade.info>,
        <tony@...mide.com>, <daniel.lezcano@...aro.org>
Subject: [PATCH] thermal: ti-soc-thermal: Disable the CPU PM notifier for OMAP4430

It has been observed that on OMAP4430 (ES2.0, ES2.1 and ES2.3) the enabled
notifier causes errors on the DTEMP readout values:

ti-soc-thermal 4a002260.bandgap: in range ADC val: 52
ti-soc-thermal 4a002260.bandgap: in range ADC val: 64
ti-soc-thermal 4a002260.bandgap: in range ADC val: 64
ti-soc-thermal 4a002260.bandgap: out of range ADC val: 0
thermal thermal_zone0: failed to read out thermal zone (-5)
ti-soc-thermal 4a002260.bandgap: out of range ADC val: 0
thermal thermal_zone0: failed to read out thermal zone (-5)
ti-soc-thermal 4a002260.bandgap: out of range ADC val: 4
thermal thermal_zone0: failed to read out thermal zone (-5)
ti-soc-thermal 4a002260.bandgap: in range ADC val: 100

raw 100 translates to 133 Celsius on omap4-sdp, triggering shutdown due to
critical temperature.

When the notifier is disable for OMAP4430 the DTEMP values are stable:
ti-soc-thermal 4a002260.bandgap: in range ADC val: 56
ti-soc-thermal 4a002260.bandgap: in range ADC val: 56
ti-soc-thermal 4a002260.bandgap: in range ADC val: 57
ti-soc-thermal 4a002260.bandgap: in range ADC val: 57
ti-soc-thermal 4a002260.bandgap: in range ADC val: 56

Fixes: 5093402e5b44 ("thermal: ti-soc-thermal: Enable addition power management")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@...com>
---
Hi,

my omap4-sdp (Blaze) was shutting down randomly due to critical temperature with
5.10-rc1 and I have bisected it back to 5093402e5b44.

Disabling the notifier fixes the random shutdowns on OMAP4430 (ES2.0 and ES2.1)
but it does not cause any issues on OMAP4460 (PandaES) or OMAP3630 (BeagleXM).
Tony's duovero with OMAP4430 ES2.3 did not ninja-shutdown, but he also have
constant and steady stream of:
thermal thermal_zone0: failed to read out thermal zone (-5)

pointing to similar issue.

Regards,
Peter

 drivers/thermal/ti-soc-thermal/ti-bandgap.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/ti-soc-thermal/ti-bandgap.c b/drivers/thermal/ti-soc-thermal/ti-bandgap.c
index 5e596168ba73..dcac99f327b0 100644
--- a/drivers/thermal/ti-soc-thermal/ti-bandgap.c
+++ b/drivers/thermal/ti-soc-thermal/ti-bandgap.c
@@ -20,6 +20,7 @@
 #include <linux/err.h>
 #include <linux/types.h>
 #include <linux/spinlock.h>
+#include <linux/sys_soc.h>
 #include <linux/reboot.h>
 #include <linux/of_device.h>
 #include <linux/of_platform.h>
@@ -864,6 +865,17 @@ static struct ti_bandgap *ti_bandgap_build(struct platform_device *pdev)
 	return bgp;
 }
 
+/*
+ * List of SoCs on which the CPU PM notifier can cause erros on the DTEMP
+ * readout.
+ * Enabled notifier on these machines results in erroneous, random values which
+ * could trigger unexpected thermal shutdown.
+ */
+static const struct soc_device_attribute soc_no_cpu_notifier[] = {
+	{ .machine = "OMAP4430" },
+	{ /* sentinel */ },
+};
+
 /***   Device driver call backs   ***/
 
 static
@@ -1020,7 +1032,8 @@ int ti_bandgap_probe(struct platform_device *pdev)
 
 #ifdef CONFIG_PM_SLEEP
 	bgp->nb.notifier_call = bandgap_omap_cpu_notifier;
-	cpu_pm_register_notifier(&bgp->nb);
+	if (!soc_device_match(soc_no_cpu_notifier))
+		cpu_pm_register_notifier(&bgp->nb);
 #endif
 
 	return 0;
@@ -1056,7 +1069,8 @@ int ti_bandgap_remove(struct platform_device *pdev)
 	struct ti_bandgap *bgp = platform_get_drvdata(pdev);
 	int i;
 
-	cpu_pm_unregister_notifier(&bgp->nb);
+	if (!soc_device_match(soc_no_cpu_notifier))
+		cpu_pm_unregister_notifier(&bgp->nb);
 
 	/* Remove sensor interfaces */
 	for (i = 0; i < bgp->conf->sensor_count; i++) {
-- 
Peter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ