Message-ID: <2b8ce280-cb91-fb23-d19a-00dcee2a3e5a@arm.com>
Date: Tue, 8 Dec 2020 09:36:56 +0000
From: Lukasz Luba <lukasz.luba@....com>
To: Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: rui.zhang@...el.com, Thara Gopinath <thara.gopinath@...aro.org>,
Amit Kucheria <amitk@...nel.org>, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] thermal/core: Emit a warning if the thermal zone is
updated without ops
Hi Daniel,
On 12/7/20 7:05 PM, Daniel Lezcano wrote:
> The actual code is silently ignoring a thermal zone update when a
> driver is requesting it without a get_temp ops set.
>
> That looks not correct, as the caller should not have called this
> function if the thermal zone is unable to read the temperature.
>
> That makes the code less robust as the check won't detect the driver
> is inconsistently using the thermal API and that does not help to
> improve the framework as these circumvolutions hide the problem at the
> source.
Makes sense.
>
> In order to detect the situation when it happens, let's add a warning
> when the update is requested without the get_temp() ops set.
>
> Any warning emitted will have to be fixed at the source of the
> problem: the caller must not call thermal_zone_device_update if there
> is not get_temp callback set.
>
> As the check is done in thermal_zone_get_temperature() via the
> update_temperature() function, it is pointless to have the check and
> the WARN in the thermal_zone_device_update() function. Just remove the
> check and let the next call to raise the warning.
>
> Cc: Thara Gopinath <thara.gopinath@...aro.org>
> Cc: Amit Kucheria <amitk@...nel.org>
> Cc: linux-pm@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org
> Signed-off-by: Daniel Lezcano <daniel.lezcano@...aro.org>
> ---
> drivers/thermal/thermal_core.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
> index 90e38cc199f4..1bd23ff2247b 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -448,17 +448,17 @@ static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
> monitor_thermal_zone(tz);
> }
>
> -static void update_temperature(struct thermal_zone_device *tz)
> +static int update_temperature(struct thermal_zone_device *tz)
> {
> int temp, ret;
>
> ret = thermal_zone_get_temp(tz, &temp);
> if (ret) {
> if (ret != -EAGAIN)
> - dev_warn(&tz->device,
> - "failed to read out thermal zone (%d)\n",
> - ret);
> - return;
> + dev_warn_once(&tz->device,
> + "failed to read out thermal zone (%d)\n",
> + ret);
> + return ret;
> }
>
> mutex_lock(&tz->lock);
> @@ -469,6 +469,8 @@ static void update_temperature(struct thermal_zone_device *tz)
> trace_thermal_temperature(tz);
>
> thermal_genl_sampling_temp(tz->id, temp);
> +
> + return 0;
> }
>
> static void thermal_zone_device_init(struct thermal_zone_device *tz)
> @@ -553,11 +555,9 @@ void thermal_zone_device_update(struct thermal_zone_device *tz,
> if (atomic_read(&in_suspend))
> return;
>
> - if (!tz->ops->get_temp)
> + if (update_temperature(tz))
> return;
>
> - update_temperature(tz);
> -
I think the patch does a bit more than that. Previously we continued
running the code below even when thermal_zone_get_temp() returned an
error (for whatever reason). Now we return early, so we no longer call
handle_thermal_trip() and monitor_thermal_zone(), and the next polling
would probably not be scheduled.
I would leave update_temperature(tz) as it was and not check the return
value. The function thermal_zone_get_temp() can protect itself from a missing
tz->ops->get_temp(), so we should be safe.
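To illustrate the suggestion, here is a minimal userspace sketch, not the
kernel code. All sketch_* names, the simplified struct layouts and the two
counters are hypothetical stand-ins for handle_thermal_trip() and
monitor_thermal_zone(), and the -EINVAL error code is an assumption. It
shows a guarded read that fails cleanly without a get_temp callback, and an
update path that ignores the return value so the trip handling and the
polling still run:

```c
#include <errno.h>
#include <stddef.h>

struct tz_sketch;

/* Simplified stand-in for the zone ops; not the real kernel definition. */
struct tz_ops {
	int (*get_temp)(struct tz_sketch *tz, int *temp);
};

struct tz_sketch {
	const struct tz_ops *ops;
	int temperature;     /* last known temperature, millidegrees C */
	int trips_handled;   /* stands in for handle_thermal_trip() calls */
	int polls_scheduled; /* stands in for monitor_thermal_zone() calls */
};

/* Guarded read: fails cleanly when the driver provides no get_temp(). */
static int sketch_get_temp(struct tz_sketch *tz, int *temp)
{
	if (!tz || !tz->ops || !tz->ops->get_temp)
		return -EINVAL; /* assumed error code */

	return tz->ops->get_temp(tz, temp);
}

/* Stays void, as before the patch: on error, keep the last temperature. */
static void sketch_update_temperature(struct tz_sketch *tz)
{
	int temp;

	if (sketch_get_temp(tz, &temp))
		return;

	tz->temperature = temp;
}

/* The return value is deliberately not checked, so the rest still runs. */
static void sketch_device_update(struct tz_sketch *tz)
{
	sketch_update_temperature(tz);

	tz->trips_handled++;   /* trips are still evaluated */
	tz->polls_scheduled++; /* the next polling is still scheduled */
}

/* Hypothetical driver callback returning a fixed reading. */
static int sketch_read_fixed(struct tz_sketch *tz, int *temp)
{
	(void)tz;
	*temp = 42000;
	return 0;
}
```

With a zone whose ops has no get_temp, sketch_device_update() leaves the
temperature untouched but still bumps both counters; with
sketch_read_fixed installed, the temperature is refreshed as well.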
What do you think?
Regards,
Lukasz