Message-ID: <ZICwP9MqJYwrw0HW@uf8f119305bce5e.ant.amazon.com>
Date:   Wed, 7 Jun 2023 09:28:47 -0700
From:   Eduardo Valentin <evalenti@...nel.org>
To:     "Zhang, Rui" <rui.zhang@...el.com>
Cc:     "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
        "Valentin, Eduardo" <eduval@...zon.com>,
        "rafael@...nel.org" <rafael@...nel.org>,
        "evalenti@...nel.org" <evalenti@...nel.org>,
        "daniel.lezcano@...aro.org" <daniel.lezcano@...aro.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "amitk@...nel.org" <amitk@...nel.org>
Subject: Re: [PATCH 1/1] thermal: sysfs: avoid actual readings from sysfs

Rui!

Long time no chatting! In this case, no email exchange. Good to hear from you.

On Wed, Jun 07, 2023 at 06:32:46AM +0000, Zhang, Rui wrote:
> 
> 
> 
> On Tue, 2023-06-06 at 17:37 -0700, Eduardo Valentin wrote:
> > From: Eduardo Valentin <eduval@...zon.com>
> >
> > As the thermal zone caches the current and last temperature
> > value, the sysfs interface can use that instead of
> > forcing an actual update or read from the device.
> > This way, if multiple userspace requests are coming
> > in, we avoid storming the device with multiple reads
> > and potentially clogging the timing requirement
> > for the governors.
> >
> > Cc: "Rafael J. Wysocki" <rafael@...nel.org> (supporter:THERMAL)
> > Cc: Daniel Lezcano <daniel.lezcano@...aro.org> (supporter:THERMAL)
> > Cc: Amit Kucheria <amitk@...nel.org> (reviewer:THERMAL)
> > Cc: Zhang Rui <rui.zhang@...el.com> (reviewer:THERMAL)
> > Cc: linux-pm@...r.kernel.org (open list:THERMAL)
> > Cc: linux-kernel@...r.kernel.org (open list)
> >
> > Signed-off-by: Eduardo Valentin <eduval@...zon.com>
> > ---
> >  drivers/thermal/thermal_sysfs.c | 21 ++++++++++++++++-----
> >  1 file changed, 16 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/thermal/thermal_sysfs.c
> > b/drivers/thermal/thermal_sysfs.c
> > index b6daea2398da..a240c58d9e08 100644
> > --- a/drivers/thermal/thermal_sysfs.c
> > +++ b/drivers/thermal/thermal_sysfs.c
> > @@ -35,12 +35,23 @@ static ssize_t
> >  temp_show(struct device *dev, struct device_attribute *attr, char
> > *buf)
> >  {
> >         struct thermal_zone_device *tz = to_thermal_zone(dev);
> > -       int temperature, ret;
> > -
> > -       ret = thermal_zone_get_temp(tz, &temperature);
> > +       int temperature;
> >
> > -       if (ret)
> > -               return ret;
> > +       /*
> > +        * don't force new update from external reads
> > +        * This way we avoid messing up with time constraints.
> > +        */
> > +       if (tz->mode == THERMAL_DEVICE_DISABLED) {
> > +               int r;
> > +
> > +               r = thermal_zone_get_temp(tz, &temperature); /* holds
> > tz->lock*/
> 
> what is the expected behavior of a disabled zone?
> 
> IMO, the hardware may not be functional at this point, and reading the
> temperature should be avoided, as we do in
> __thermal_zone_device_update().
> 
> should we just return failure in this case?
> 
> userspace should poke the temp attribute for enabled zones only.

While I see your point, my understanding is that the thermal zone mode
selects between kernel mode and userspace mode. In my interpretation, it
dictates where the control is, not necessarily that there is a
malfunction.

From that perspective, the expected behavior here is to have an
in-userspace control/governor, which:
1. disables the in-kernel control
2. monitors the thermal zone by reading the /temp property
3. actuates the assigned cooling devices for the thermal zone.
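
To make the three steps concrete, here is a minimal userspace sketch, not
taken from the patch. The helper names (read_temp_mc, write_str,
pick_cooling_state, governor_step) and the 5 degree-per-state policy are
made up for illustration; only the sysfs attribute names mentioned in the
usage note are real, and the zone/cooling device indices there are assumed.

```c
#include <errno.h>
#include <stdio.h>

/* Read one integer (millidegrees C) from a sysfs-style file.
 * Returns 0 on success or a negative errno value. */
static int read_temp_mc(const char *path, int *temp_mc)
{
	FILE *f = fopen(path, "r");
	int ret = 0;

	if (!f)
		return -errno;
	errno = 0;
	if (fscanf(f, "%d", temp_mc) != 1)
		ret = errno ? -errno : -EIO;
	fclose(f);
	return ret;
}

/* Write a string to a sysfs-style file. Returns 0 or a negative errno. */
static int write_str(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");
	int ret = 0;

	if (!f)
		return -errno;
	if (fputs(val, f) == EOF)
		ret = -EIO;
	if (fclose(f) == EOF && !ret)
		ret = errno ? -errno : -EIO;
	return ret;
}

/* Hypothetical policy: state 0 below the trip point, then one more
 * cooling state per 5000 millidegrees above it, capped at max_state. */
static int pick_cooling_state(int temp_mc, int trip_mc, int max_state)
{
	int state;

	if (temp_mc < trip_mc)
		return 0;
	state = 1 + (temp_mc - trip_mc) / 5000;
	return state > max_state ? max_state : state;
}

/* One iteration of steps 2 and 3: read the zone temperature, then
 * write the chosen state to the cooling device. */
static int governor_step(const char *temp_path, const char *cdev_path,
			 int trip_mc, int max_state)
{
	char val[16];
	int temp_mc, ret;

	ret = read_temp_mc(temp_path, &temp_mc);
	if (ret)
		return ret;
	snprintf(val, sizeof(val), "%d\n",
		 pick_cooling_state(temp_mc, trip_mc, max_state));
	return write_str(cdev_path, val);
}
```

Step 1 would be a single write_str("/sys/class/thermal/thermal_zone0/mode",
"disabled") before looping over governor_step() with
/sys/class/thermal/thermal_zone0/temp and
/sys/class/thermal/cooling_device0/cur_state, assuming those are the zone
and cooling device of interest.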

The above setup works pretty well for non-critical control, where
the system design or state does not require an in-kernel control.
In that scenario, the cached value proposed here will not be updated,
given that the in-kernel thread is no longer collecting/updating
temperature values. Therefore, the sysfs entry has to talk to the
driver to get the most current value.

For the failure case you referred to, Rui, this patch handles it
too. It will talk to the driver; if the device is malfunctioning, the
driver will return an error, which will be reported back to userspace
as an error code upon read. That is the expected behavior, so that
userspace knows there is a problem.
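
The behavior described above can be modeled outside the kernel as a small
sketch. This is a userspace model under stated assumptions, not the kernel
code: struct zone_model, zone_show_temp() and the two fake sensor callbacks
are hypothetical names, and the tz->lock handling from the patch is left
out for brevity.

```c
#include <errno.h>
#include <stdio.h>

enum zone_mode { ZONE_DISABLED, ZONE_ENABLED };

/* Hypothetical stand-in for the fields of a thermal zone used here. */
struct zone_model {
	enum zone_mode mode;
	int cached_mc;		/* last value stored by the in-kernel update */
	int (*get_temp)(struct zone_model *z, int *temp_mc);
	int driver_reads;	/* counts how often the driver was queried */
};

/* Fake driver callback: a working sensor that reports 42000 mC. */
static int fake_sensor_ok(struct zone_model *z, int *temp_mc)
{
	z->driver_reads++;
	*temp_mc = 42000;
	return 0;
}

/* Fake driver callback: a malfunctioning sensor. */
static int fake_sensor_broken(struct zone_model *z, int *temp_mc)
{
	(void)temp_mc;
	z->driver_reads++;
	return -EIO;
}

/* Mirrors the branch in the patch: a disabled zone asks the driver and
 * propagates its error; an enabled zone returns the cached value. */
static int zone_show_temp(struct zone_model *z, char *buf, size_t len)
{
	int temp_mc;

	if (z->mode == ZONE_DISABLED) {
		int r = z->get_temp(z, &temp_mc);

		if (r)
			return r;	/* surfaces to userspace as a read error */
	} else {
		temp_mc = z->cached_mc;	/* no device access */
	}
	return snprintf(buf, len, "%d\n", temp_mc);
}
```

With an enabled zone, any number of reads leaves driver_reads untouched;
with a disabled zone, each read goes to the callback, and a failing
callback such as fake_sensor_broken() turns into a negative return value
instead of a stale temperature.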

> 
> thanks,
> rui
> > +               if (r)
> > +                       return r;
> > +       } else {
> > +               mutex_lock(&tz->lock);
> > +               temperature = tz->temperature;
> > +               mutex_unlock(&tz->lock);
> > +       }
> >
> >         return sprintf(buf, "%d\n", temperature);
> >  }
> 

-- 
All the best,
Eduardo Valentin
