Message-ID: <CAPDyKFqKibp2d7GHZwxi1Kf3oPhM+wF1c+YEfO3viRc0HSufwA@mail.gmail.com>
Date:   Thu, 17 Aug 2023 23:40:33 +0200
From:   Ulf Hansson <ulf.hansson@...aro.org>
To:     Frank Li <Frank.li@....com>
Cc:     Daniel Lezcano <daniel.lezcano@...aro.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Amit Kucheria <amitk@...nel.org>,
        Zhang Rui <rui.zhang@...el.com>,
        Shawn Guo <shawnguo@...nel.org>,
        Sascha Hauer <s.hauer@...gutronix.de>,
        Pengutronix Kernel Team <kernel@...gutronix.de>,
        Fabio Estevam <festevam@...il.com>,
        NXP Linux Team <linux-imx@....com>,
        "open list:THERMAL" <linux-pm@...r.kernel.org>,
        "moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE" 
        <linux-arm-kernel@...ts.infradead.org>,
        open list <linux-kernel@...r.kernel.org>, imx@...ts.linux.dev
Subject: Re: [PATCH 1/1] thermal/drivers/imx_sc_thermal: return -EAGAIN when
 SCFW turn off resource

On Thu, 17 Aug 2023 at 17:31, Frank Li <Frank.li@....com> wrote:
>
> On Wed, Aug 16, 2023 at 11:23:17PM +0200, Ulf Hansson wrote:
> > On Wed, 16 Aug 2023 at 22:46, Daniel Lezcano <daniel.lezcano@...aro.org> wrote:
> > >
> > > On 16/08/2023 19:07, Frank Li wrote:
> > > > On Wed, Aug 16, 2023 at 06:47:17PM +0200, Daniel Lezcano wrote:
> > > >> On 16/08/2023 18:28, Frank Li wrote:
> > > >>> On Wed, Aug 16, 2023 at 10:44:32AM +0200, Daniel Lezcano wrote:
> > > >>>>
> > > >>>> Hi Frank,
> > > >>>>
> > > >>>> sorry for the delay
> > > >>>>
> > > >>>> On 14/07/2023 19:19, Frank Li wrote:
> > > >>>>> On Thu, Jul 13, 2023 at 02:49:54PM +0200, Daniel Lezcano wrote:
> > > >>>>>> On 12/07/2023 23:05, Frank Li wrote:
> > > >>>>>>> Avoid endless print following message when SCFW turns off resource.
> > > >>>>>>>      [ 1818.342337] thermal thermal_zone0: failed to read out thermal zone (-1)
> > > >>>>>>>
> > > >>>>>>> Signed-off-by: Frank Li <Frank.Li@....com>
> > > >>>>>>> ---
> > > >>>>>>>      drivers/thermal/imx_sc_thermal.c | 4 +++-
> > > >>>>>>>      1 file changed, 3 insertions(+), 1 deletion(-)
> > > >>>>>>>
> > > >>>>>>> diff --git a/drivers/thermal/imx_sc_thermal.c b/drivers/thermal/imx_sc_thermal.c
> > > >>>>>>> index 8d6b4ef23746..0533d58f199f 100644
> > > >>>>>>> --- a/drivers/thermal/imx_sc_thermal.c
> > > >>>>>>> +++ b/drivers/thermal/imx_sc_thermal.c
> > > >>>>>>> @@ -58,7 +58,9 @@ static int imx_sc_thermal_get_temp(struct thermal_zone_device *tz, int *temp)
> > > >>>>>>>         hdr->size = 2;
> > > >>>>>>>         ret = imx_scu_call_rpc(thermal_ipc_handle, &msg, true);
> > > >>>>>>> -       if (ret)
> > > >>>>>>> +       if (ret == -EPERM) /* NO POWER */
> > > >>>>>>> +               return -EAGAIN;
> > > >>>>>>
> > > >>>>>> Isn't there a chain call somewhere when the resource is turned off, so the
> > > >>>>>> thermal zone can be disabled?
> > > >>>>>
> > > >>>>> A possible place is drivers/firmware/imx/scu-pd.c, but I am not sure how to
> > > >>>>> get the thermal devices. I just found an API, thermal_zone_get_zone_by_name(). I
> > > >>>>> am not sure if it is good to depend on "name", which adds coupling between
> > > >>>>> two drivers, and if an external thermal device has the
> > > >>>>> same name, it will wrongly be turned off.
> > > >>>>
> > > >>>> Correct
> > > >>>>
> > > >>>>> If I add a power domain notification in the thermal driver, I am not sure how to get
> > > >>>>> the other devices' pd in the thermal driver.
> > > >>>>>
> > > >>>>> Any example I can refer?
> > > >>>>>
> > > >>>>> Or this is simple enough solution.
> > > >>>>
> > > >>>> The solution works for removing the error message but it does not solve the
> > > >>>> root cause of the issue. The thermal zone keeps monitoring while the sensor
> > > >>>> is down.
> > > >>>>
> > > >>>> So the question is why the sensor is shut down if it is in use?
> > > >>>
> > > >>> Do you know if there is any code I can reference? I suppose it is quite common.
> > > >>
> > > >> Sorry, I don't get your comment
> > > >>
> > > >> What I meant is why is the sensor turned off if it is in use ?
> > > >
> > > > One typical example is CPU hotplug. The sensor is located in the CPU power domain.
> > > > If a CPU is hotplugged off, the CPU power domain will be turned off.
> > > >
> > > > It doesn't make sense to keep monitoring such a sensor when the CPU is already powered off.
> > > > It doesn't make sense to keep the CPU powered on just to get sensor
> > > > data.
> > > >
> > > > Another example is the GPU, if there are GPU0 and GPU1. In most cases only GPU0
> > > > works. GPU1 may be turned off when the load is low.
> > > >
> > > > Ideally, thermal can get a notification from the power domain driver.
> > > > When such a power domain turns off, disable the thermal zone.
> > > >
> > > > So far, I have no idea how to do that.
> > >
> > > Ulf,
> > >
> > > do you have a guidance to link the thermal zone and the power domain in
> > > order to get a poweron/off notification leading to enable/disable the
> > > thermal zone ?
> >
> > I don't know the details here, so apologize for my ignorance to start
> > with. What platform is this?
>
> i.MX8QM.

Thanks!

>
> >
> > A vague idea could be to hook up the thermal sensor to the
> > corresponding CPU power domain. Assuming the CPU power domain is
> > modelled as a genpd provider, then this allows the driver for the
> > thermal sensor to register for power-on/off notifications of the genpd
> > (see dev_pm_genpd_add_notifier()).
> >
> > Can this work?
>
> I don't think so. dev_pm_genpd_add_notifier() needs a dev that is bound to the pd.

Yes, correct.

>
> tsens: thermal-sensor {
>         compatible = "fsl,imx-sc-thermal";
>         tsens-num = <6>;
>         #thermal-sensor-cells = <1>;
> };

Are you saying that the above doesn't have a corresponding struct
device created for it? That sounds like a problem that can be fixed,
right? Not sure if it makes sense though.

>
> we have 6 thermal-sensor, which assocated with 6 pd,
>         IMX_SC_R_SYSTEM, IMX_SC_R_PMIC_0,
>         IMX_SC_R_AP_0, IMX_SC_R_AP_1,
>         IMX_SC_R_GPU_0_PID0, IMX_SC_R_GPU_1_PID0,
>         IMX_SC_R_DRC_0
>
> We don't want to hold a PD on just because we want to get the temperature. The GPU pd
> consumes a lot of power.

Of course, that seems like a bad idea.

The corresponding struct device that's hooked up to a genpd can
remain runtime suspended for as long as you think it makes sense. Thus it
would not keep the PM domain powered on when it isn't needed.

>
> I want to register one callback in the thermal-sensor driver: when the GPU pd is on,
> enable the thermal zone; when the GPU pd is off, disable the thermal zone.

Right, that should work fine too, I think. It seems like this is just
a matter of modelling this correctly in DT; I have no strong opinion
in this regard.
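
As a rough illustration of the callback idea above, here is a small
userspace model in C. It is not kernel code: struct zone_model,
struct pd_model, enum pd_event and all function names are hypothetical
stand-ins for the thermal zone, the genpd and its power-on/off
notifications. It only shows the intended behaviour: the zone follows
the power state of its domain, so it is never polled while the domain
is off.

```c
/* Userspace model only: none of these types or names are the kernel's. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the genpd power-on/off notifications. */
enum pd_event { PD_EVENT_OFF, PD_EVENT_ON };

/* Hypothetical stand-in for a thermal zone. */
struct zone_model {
	bool enabled;      /* the zone is polled only while this is true */
	int reads;         /* sensor reads attempted */
	int failed_reads;  /* reads attempted while the domain was off */
};

/* Hypothetical stand-in for a power domain with a single listener. */
struct pd_model {
	bool powered;
	void (*notify)(enum pd_event ev, void *data);
	void *data;
};

/* Listener: mirror the power state of the domain into the zone. */
void zone_pd_notify(enum pd_event ev, void *data)
{
	struct zone_model *z = data;

	assert(z != NULL);
	z->enabled = (ev == PD_EVENT_ON);
}

/* Change the power state and tell the listener about it. */
void pd_set_power(struct pd_model *pd, bool on)
{
	pd->powered = on;
	if (pd->notify)
		pd->notify(on ? PD_EVENT_ON : PD_EVENT_OFF, pd->data);
}

/* One polling tick: a disabled zone is skipped, so no failed read. */
void zone_poll(struct zone_model *z, const struct pd_model *pd)
{
	if (!z->enabled)
		return;
	z->reads++;
	if (!pd->powered)
		z->failed_reads++;
}
```

With zone_pd_notify() registered as the listener, calling
pd_set_power(&pd, false) disables the zone and later zone_poll() calls
return without touching the sensor. In a real driver the listener
would presumably be a notifier block registered through
dev_pm_genpd_add_notifier(), which is the part that still depends on
how the sensor is modelled in DT.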

>
> we can do more common way.
>
>         gpu-thermal1 {
>                         polling-delay-passive = <250>;
>                         polling-delay = <2000>;
>         >>>             pd=<&GPU1_PD>
>                         thermal-sensors = <&tsens IMX_SC_R_GPU_1_PID0>;
>
>                 };
>
> if GPU1_PD on, then gpu-thermal1 enable,
> if GPU1_PD off, then gpu-thermal1 disable.
>

Sounds like it's worth a try! Please keep me posted.

Kind regards
Uffe
