[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8f0a9bd6-52dd-442f-b0fd-73cf7028d9f0@roeck-us.net>
Date: Thu, 6 Feb 2025 10:57:10 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: Werner Sembach <wse@...edocomputers.com>
Cc: hdegoede@...hat.com, ilpo.jarvinen@...ux.intel.com, ukleinek@...nel.org,
jdelvare@...e.com, linux-kernel@...r.kernel.org,
platform-driver-x86@...r.kernel.org, linux-pwm@...r.kernel.org,
linux-hwmon@...r.kernel.org
Subject: Re: [RFC PATCH 1/1] platform/x86/tuxedo: Implement TUXEDO TUXI ACPI
TFAN via hwmon
On Thu, Feb 06, 2025 at 10:28:01AM +0100, Werner Sembach wrote:
[ ... ]
> > > + temp = retval * 100 - 272000;
> > > +
> > > + for (j = 0; temp_levels[j].temp; ++j) {
> > > + temp_low = j == 0 ? -272000 : temp_levels[j-1].temp;
> > > + temp_high = temp_levels[j].temp;
> > > + if (driver_data->temp_level[i] > j)
> > > + temp_high -= 2000; // hysteresis
> > > +
> > > + if (temp >= temp_low && temp < temp_high)
> > > + driver_data->temp_level[i] = j;
> > > + }
> > > + if (temp >= temp_high)
> > > + driver_data->temp_level[i] = j;
> > > +
> > > + temp_level = driver_data->temp_level[i];
> > > + min_speed = temp_level == 0 ?
> > > + 0 : temp_levels[temp_level-1].min_speed;
> > > + curr_speed = driver_data->curr_speed[i];
> > > + want_speed = driver_data->want_speed[i];
> > > +
> > > + if (want_speed < min_speed) {
> > > + if (curr_speed < min_speed)
> > > + write_speed(dev, i, min_speed);
> > > + } else if (curr_speed != want_speed)
> > > + write_speed(dev, i, want_speed);
> > > + }
> > > +
> > > + schedule_delayed_work(&driver_data->work, TUXI_SAFEGUARD_PERIOD);
> > > +}
> >
> > This is not expected functionality of a hardware monitoring driver.
> > Hardware monmitoring drivers should not replicate userspace or
> > thermal subsystem functionality.
> >
> > This would be unacceptable in drivers/hwmon/.
>
> Problem is: The thermal subsystem doesn't do this either as far as I can tell.
>
> See this: https://lore.kernel.org/all/453e0df5-416b-476e-9629-c40534ecfb72@tuxedocomputers.com/
> and this: https://lore.kernel.org/all/41483e2b-361b-4b84-88a7-24fc1eaae745@tuxedocomputers.com/
> thread.
>
> The short version is: The Thermal subsystem always allows userspace to
> select the "userspace" governor which has no way for the kernel to enforce a
> minimum speed.
>
You can specify thermal parameters / limits using devicetree. Also, drivers
can always enforce value ranges.
> As far as I can tell the Thermal subsystem would require a new governor for
> the behavior i want to archive and more importantly, a way to restrict which
> governors userspace can select.
>
> As to why I don't want grant userspace full control: The firmware is
> perfectly fine with accepting potentially mainboard frying settings (as
> mentioned in the cover letter) and the lowest level I can write code for is
> the kernel driver. So that's the location I need to prevent this.
>
It is ok for the kernel to accept and enforce _limits_ (such as lower and upper
ranges for temperatures) when they are written. That is not what the code here
does.
> Also hwmon is not purely a hardware monitoring, it also allows writing
> fanspeeds. Or did I miss something and this shouldn't actually be used?
>
If doesn't actively control fan speeds, though. It just tells the firmware what
the limits or target values are.
> >
> > Personally I think this is way too complicated. It would make much more sense
> > to assume a reasonable maximum (say, 16) and use fixed size arrays to access
> > the data. The is_visible function can then simply return 0 for larger channel
> > values if the total number of fans is less than the ones configured in the
> > channel information.
> Didn't know it was possible to filter extra entries out completely with the
> is_visible function, thanks for the tip.
> >
> > Also, as already mentioned, there is no range check of fan_count. This will
> > cause some oddities if the system ever claims to have 256+ fans.
> Will not happen, but i guess a singular additional if in the init doesn't
> hurt, i can add it.
You are making the assumption that the firmware always provides correct
values.
I fully agree that repeated range checks for in-kernel API functions are
useless. However, values should still be checked when a value enters
the kernel, either via userspace or via hardware, even more so if that value
is used to determine, like here, the amount of memory allocated. Or, worse,
if the value is reported as 32-bit value and written into an 8-byte variable.
> >
> > > + *hwmdev = devm_hwmon_device_register_with_info(&pdev->dev,
> > > + "tuxedo_nbxx_acpi_tuxi",
> > > + driver_data, &hwminfo,
> > > + NULL);
> > > + if (PTR_ERR_OR_ZERO(*hwmdev))
> > > + return PTR_ERR_OR_ZERO(*hwmdev);
> > > +
> > Why not just return hwmdev ?
> because if hwmon is NULL it is still an error, i have to look again at what
> actually is returned by PTR_ERR_OR_ZERO on zero.
That seems a bit philosophical. The caller would have to check for
PTR_ERR_OR_ZERO() instead of checking for < 0.
On a side note, the code now returns 0 if devm_hwmon_device_register_with_info()
returned NULL. devm_hwmon_device_register_with_info() never returns NULL,
so that doesn't make a difference in practice, but, still, this should
at least use PTR_ERR().
Guenter
Powered by blists - more mailing lists