lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 25 Feb 2022 15:02:27 +0100
From:   Daniel Lezcano <daniel.lezcano@...aro.org>
To:     Benjamin Li <benl@...areup.com>, Amit Kucheria <amitk@...nel.org>,
        Thara Gopinath <thara.gopinath@...aro.org>
Cc:     Bjorn Andersson <bjorn.andersson@...aro.org>,
        Zac Crosby <zac@...areup.com>, Andy Gross <agross@...nel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Zhang Rui <rui.zhang@...el.com>, linux-arm-msm@...r.kernel.org,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] drivers: thermal: tsens: respect thermal_device_mode
 in threshold irq reporting

On 20/01/2022 21:01, Benjamin Li wrote:
> 'echo disabled > .../thermal_zoneX/mode' will disable the thermal core's
> polling mechanism to check for threshold trips. This is used sometimes to
> run performance test cases.
> 
> However, tsens supports an interrupt mechanism to receive notification of
> trips, implemented in commit 634e11d5b450 ("drivers: thermal: tsens: Add
> interrupt support").
> 
> Currently the thermal zone mode that's set by userspace is not checked
> before propagating threshold trip events from IRQs. Let's fix this to
> restore the abilty to disable thermal throttling at runtime.
> 
> ====================
> 
> Tested on MSM8939 running 5.16.0. This platform has 8 cores; the first
> four thermal zones control cpu0-3 and the last zone is for the other four
> CPUs together.
> 
>    for f in /sys/class/thermal/thermal_zone*; do
>      echo "disabled" > $f/mode
>      echo $f | paste - $f/type $f/mode
>    done
> 
> /sys/class/thermal/thermal_zone0	cpu0-thermal	disabled
> /sys/class/thermal/thermal_zone1	cpu1-thermal	disabled
> /sys/class/thermal/thermal_zone2	cpu2-thermal	disabled
> /sys/class/thermal/thermal_zone3	cpu3-thermal	disabled
> /sys/class/thermal/thermal_zone4	cpu4567-thermal	disabled
> 
> With mitigation thresholds at 75 degC and load running, we can now cruise
> past temp=75000 without CPU throttling kicking in.
> 
>    watch -n 1 "grep '' /sys/class/thermal/*/temp
>        /sys/class/thermal/*/cur_state
>        /sys/bus/cpu/devices/cpu*/cpufreq/cpuinfo_cur_freq"
> 
> /sys/class/thermal/thermal_zone0/temp:82000
> /sys/class/thermal/thermal_zone1/temp:84000
> /sys/class/thermal/thermal_zone2/temp:87000
> /sys/class/thermal/thermal_zone3/temp:84000
> /sys/class/thermal/thermal_zone4/temp:84000
> /sys/class/thermal/cooling_device0/cur_state:0
> /sys/class/thermal/cooling_device1/cur_state:0
> /sys/bus/cpu/devices/cpu0/cpufreq/cpuinfo_cur_freq:1113600
> /sys/bus/cpu/devices/cpu1/cpufreq/cpuinfo_cur_freq:1113600
> /sys/bus/cpu/devices/cpu2/cpufreq/cpuinfo_cur_freq:1113600
> /sys/bus/cpu/devices/cpu3/cpufreq/cpuinfo_cur_freq:1113600
> /sys/bus/cpu/devices/cpu4/cpufreq/cpuinfo_cur_freq:800000
> /sys/bus/cpu/devices/cpu5/cpufreq/cpuinfo_cur_freq:800000
> /sys/bus/cpu/devices/cpu6/cpufreq/cpuinfo_cur_freq:800000
> /sys/bus/cpu/devices/cpu7/cpufreq/cpuinfo_cur_freq:800000
> 
> Reported-by: Zac Crosby <zac@...areup.com>
> Reviewed-by: Bjorn Andersson <bjorn.andersson@...aro.org>
> Signed-off-by: Benjamin Li <benl@...areup.com>
> ---
> Changes in v3:
> - Upgraded logging to dev_info_ratelimited and revised log message.
> - Remove unrelated hunk.
> 
> Some drivers that support thermal zone disabling implement a set_mode
> operation and simply disable the sensor or the relevant IRQ(s), so they
> actually don't log anything when zones are disabled. These drivers are
> imx_thermal.c, intel_quark_dts_thermal.c, and int3400_thermal.c.
> 
> For tsens.c, implementing a change_mode would require migrating the driver
> from devm_thermal_zone_of_sensor_register to thermal_zone_device_register
> (or updating thermal_of.c to add a change_mode operation in thermal_zone_
> of_device_ops).
> 
> stm_thermal.c seems to use this patch's model of not disabling IRQs when
> the zone is disabled (they still perform the thermal_zone_device_update
> upon IRQ, but return -EAGAIN from their get_temp).

What is the concern by changing the core code to have a correct handling 
of the disabled / enabled state in this driver ? (and by this way give 
the opportunity to other drivers to fix their code)


> Changes in v2:
> - Reordered sentences in first part of commit message to make sense.
> 
>   drivers/thermal/qcom/tsens.c | 12 ++++++++----
>   1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/thermal/qcom/tsens.c b/drivers/thermal/qcom/tsens.c
> index 99a8d9f3e03c..dd0002829536 100644
> --- a/drivers/thermal/qcom/tsens.c
> +++ b/drivers/thermal/qcom/tsens.c
> @@ -509,10 +509,14 @@ static irqreturn_t tsens_irq_thread(int irq, void *data)
>   		spin_unlock_irqrestore(&priv->ul_lock, flags);
>   
>   		if (trigger) {
> -			dev_dbg(priv->dev, "[%u] %s: TZ update trigger (%d mC)\n",
> -				hw_id, __func__, temp);
> -			thermal_zone_device_update(s->tzd,
> -						   THERMAL_EVENT_UNSPECIFIED);
> +			if (s->tzd->mode == THERMAL_DEVICE_ENABLED) {
> +				dev_dbg(priv->dev, "[%u] %s: TZ update trigger (%d mC)\n",
> +					hw_id, __func__, temp);
> +				thermal_zone_device_update(s->tzd, THERMAL_EVENT_UNSPECIFIED);
> +			} else {
> +				dev_info_ratelimited(priv->dev, "[%u] %s: TZ update trigger (%d mC) skipped - zone disabled, operating outside of safety limits!\n",
> +					hw_id, __func__, temp);
> +			}
>   		} else {
>   			dev_dbg(priv->dev, "[%u] %s: no violation:  %d\n",
>   				hw_id, __func__, temp);


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ