[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b32284b7-ddc8-4fb5-82f8-20199b0dec5a@tuxedocomputers.com>
Date: Thu, 6 Feb 2025 23:55:01 +0100
From: Werner Sembach <wse@...edocomputers.com>
To: Guenter Roeck <linux@...ck-us.net>
Cc: hdegoede@...hat.com, ilpo.jarvinen@...ux.intel.com, ukleinek@...nel.org,
jdelvare@...e.com, linux-kernel@...r.kernel.org,
platform-driver-x86@...r.kernel.org, linux-pwm@...r.kernel.org,
linux-hwmon@...r.kernel.org
Subject: Re: [RFC PATCH 1/1] platform/x86/tuxedo: Implement TUXEDO TUXI ACPI
TFAN via hwmon
Am 06.02.25 um 19:57 schrieb Guenter Roeck:
> On Thu, Feb 06, 2025 at 10:28:01AM +0100, Werner Sembach wrote:
>
> [ ... ]
>
>>>> + temp = retval * 100 - 272000;
>>>> +
>>>> + for (j = 0; temp_levels[j].temp; ++j) {
>>>> + temp_low = j == 0 ? -272000 : temp_levels[j-1].temp;
>>>> + temp_high = temp_levels[j].temp;
>>>> + if (driver_data->temp_level[i] > j)
>>>> + temp_high -= 2000; // hysteresis
>>>> +
>>>> + if (temp >= temp_low && temp < temp_high)
>>>> + driver_data->temp_level[i] = j;
>>>> + }
>>>> + if (temp >= temp_high)
>>>> + driver_data->temp_level[i] = j;
>>>> +
>>>> + temp_level = driver_data->temp_level[i];
>>>> + min_speed = temp_level == 0 ?
>>>> + 0 : temp_levels[temp_level-1].min_speed;
>>>> + curr_speed = driver_data->curr_speed[i];
>>>> + want_speed = driver_data->want_speed[i];
>>>> +
>>>> + if (want_speed < min_speed) {
>>>> + if (curr_speed < min_speed)
>>>> + write_speed(dev, i, min_speed);
>>>> + } else if (curr_speed != want_speed)
>>>> + write_speed(dev, i, want_speed);
>>>> + }
>>>> +
>>>> + schedule_delayed_work(&driver_data->work, TUXI_SAFEGUARD_PERIOD);
>>>> +}
>>> This is not expected functionality of a hardware monitoring driver.
>>> Hardware monmitoring drivers should not replicate userspace or
>>> thermal subsystem functionality.
>>>
>>> This would be unacceptable in drivers/hwmon/.
>> Problem is: The thermal subsystem doesn't do this either as far as I can tell.
>>
>> See this: https://lore.kernel.org/all/453e0df5-416b-476e-9629-c40534ecfb72@tuxedocomputers.com/
>> and this: https://lore.kernel.org/all/41483e2b-361b-4b84-88a7-24fc1eaae745@tuxedocomputers.com/
>> thread.
>>
>> The short version is: The Thermal subsystem always allows userspace to
>> select the "userspace" governor which has no way for the kernel to enforce a
>> minimum speed.
>>
> You can specify thermal parameters / limits using devicetree. Also, drivers
> can always enforce value ranges.
Sorry for my noob question: What do you mean with devicetree in x86 context?
I only want to enforce a value range at a certain temperature, if the
device is cool, the fan can be turned off for example.
>
>> As far as I can tell the Thermal subsystem would require a new governor for
>> the behavior i want to archive and more importantly, a way to restrict which
>> governors userspace can select.
>>
>> As to why I don't want grant userspace full control: The firmware is
>> perfectly fine with accepting potentially mainboard frying settings (as
>> mentioned in the cover letter) and the lowest level I can write code for is
>> the kernel driver. So that's the location I need to prevent this.
>>
> It is ok for the kernel to accept and enforce _limits_ (such as lower and upper
> ranges for temperatures) when they are written. That is not what the code here
> does.
It conditionally enforces a minimum fanspeed.
So is the problem that hwmon drivers are only allowed to enforce
unconditional limits?
>
>> Also hwmon is not purely a hardware monitoring, it also allows writing
>> fanspeeds. Or did I miss something and this shouldn't actually be used?
>>
> If doesn't actively control fan speeds, though. It just tells the firmware what
> the limits or target values are.
What is the difference if it tells the firmware a target fanspeed, which
can be ignored by it, or a driver a target fanspeed, which can be
ignored by it?
>
>>> Personally I think this is way too complicated. It would make much more sense
>>> to assume a reasonable maximum (say, 16) and use fixed size arrays to access
>>> the data. The is_visible function can then simply return 0 for larger channel
>>> values if the total number of fans is less than the ones configured in the
>>> channel information.
>> Didn't know it was possible to filter extra entries out completely with the
>> is_visible function, thanks for the tip.
>>> Also, as already mentioned, there is no range check of fan_count. This will
>>> cause some oddities if the system ever claims to have 256+ fans.
>> Will not happen, but i guess a singular additional if in the init doesn't
>> hurt, i can add it.
> You are making the assumption that the firmware always provides correct
> values.
>
> I fully agree that repeated range checks for in-kernel API functions are
> useless. However, values should still be checked when a value enters
> the kernel, either via userspace or via hardware, even more so if that value
> is used to determine, like here, the amount of memory allocated. Or, worse,
> if the value is reported as 32-bit value and written into an 8-byte variable.
ok
>
>>>> + *hwmdev = devm_hwmon_device_register_with_info(&pdev->dev,
>>>> + "tuxedo_nbxx_acpi_tuxi",
>>>> + driver_data, &hwminfo,
>>>> + NULL);
>>>> + if (PTR_ERR_OR_ZERO(*hwmdev))
>>>> + return PTR_ERR_OR_ZERO(*hwmdev);
>>>> +
>>> Why not just return hwmdev ?
>> because if hwmon is NULL it is still an error, i have to look again at what
>> actually is returned by PTR_ERR_OR_ZERO on zero.
> That seems a bit philosophical. The caller would have to check for
> PTR_ERR_OR_ZERO() instead of checking for < 0.
>
> On a side note, the code now returns 0 if devm_hwmon_device_register_with_info()
> returned NULL. devm_hwmon_device_register_with_info() never returns NULL,
> so that doesn't make a difference in practice, but, still, this should
> at least use PTR_ERR().
ok
>
> Guenter
Powered by blists - more mailing lists