lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
 <PA4PR04MB763051DCED9E30B9640BCDD0C5C82@PA4PR04MB7630.eurprd04.prod.outlook.com>
Date: Thu, 6 Mar 2025 13:35:53 +0000
From: Maud Spierings | GOcontroll <maudspierings@...ontroll.com>
To: Guenter Roeck <linux@...ck-us.net>, Jean Delvare <jdelvare@...e.com>
CC: "linux-hwmon@...r.kernel.org" <linux-hwmon@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] hwmon: (ntc_thermistor) return error instead of clipping
 on OOB

From: Guenter Roeck <groeck7@...il.com> on behalf of Guenter Roeck <linux@...ck-us.net>
Sent: Tuesday, March 4, 2025 4:04 PM
 
>On 3/4/25 06:10, Maud Spierings via B4 Relay wrote:
>> From: Maud Spierings <maudspierings@...ontroll.com>
>>
>> When the ntc is reading Out Of Bounds instead of clipping to the nearest
>> limit (min/max) return -ENODATA. This prevents malfunctioning sensors
>> from sending a device into a shutdown loop due to a critical trip.
>>
>> This implementation will only work for ntc type thermistors if a ptc
>> type is to be implemented the min/max ohm calculation must be adjusted
>> to take that into account.
>>
>> Signed-off-by: Maud Spierings <maudspierings@...ontroll.com>
>> ---
>> This patch is a continuation of another discussion [1]. I felt like it
>> should be a new patch, not a v2 as this is a very different change.
thoughts about the theoretic code comment here.
>> I have left the clamping of n to INT_MAX in the code with a comment, but
>> it may be possible to find a better solution to it. One I thought of is
>> to make the ohm field of the ntc_compensation struct a signed int as
>> well. It would get rid of this weird edge case, but it doesn't make
>> sense to allow for negative resistances to be entered into the sensor
>> table.
>>
>> Currently the only feedback this provides to the user is when they
>> manually try to read the temperature and it returns the error. I have
>> added a simple printk to these error points to see how spammy it gets
>> and this is the result:
>>
>> dmesg | grep hwmon
>> [    4.982682] hwmon: sensor out of bounds
>> [    5.249758] hwmon: sensor out of bounds
>> [    5.633729] hwmon: sensor out of bounds
>> [    6.215285] hwmon: sensor out of bounds
>> [    7.073882] hwmon: sensor out of bounds
>> [    7.486620] hwmon: sensor out of bounds
>> [    8.833765] hwmon: sensor out of bounds
>> [   10.785969] hwmon: sensor out of bounds
>> [   13.793722] hwmon: sensor out of bounds
>> [   16.761124] hwmon: sensor out of bounds
>> [   17.889706] hwmon: sensor out of bounds
>> [   25.057715] hwmon: sensor out of bounds
>> [   35.041725] hwmon: sensor out of bounds
>> [   50.110346] hwmon: sensor out of bounds
>> [   72.945283] hwmon: sensor out of bounds
>> [  105.712619] hwmon: sensor out of bounds
>> [  154.863976] hwmon: sensor out of bounds
>> [  164.937104] hwmon: sensor out of bounds
>> [  228.590909] hwmon: sensor out of bounds
>> [  315.365777] hwmon: sensor out of bounds
>> [  464.718403] hwmon: sensor out of bounds
>> [  615.079123] hwmon: sensor out of bounds
>> [  764.496780] hwmon: sensor out of bounds
>>
>> This is with polling-delay set to 1000, it seems to rate-limit itself?
>> But I feel there should be a better way to communicate the potential
>> sensor failure to the user, but I can't think of anything.
>>
>
>Hmm ... we could add "fault" attributes and report a sensor fault
>if uv == 0 or uv >= puv, together with the -ENODATA temperature reading.
>That could distinguish the fault from the "resistor value out of range" case.

I've done some working around that fault attribute, but I can't seem to
find a satisfactory way to implement it. This fault also is not triggered
if anything is disconnected, maybe the 0 case but I can't even get that.

I think leaving it at this current solution is good for now.

>> [1]: https://lore.kernel.org/all/20250304-ntc_min_max-v1-1-b08e70e56459@gocontroll.com/
>
>Most of the above should be after "---" since it is irrelevant for the commit log.

I believe my cover letter seperated it correctly after my signed-off-by
tag.

>> ---
>>   drivers/hwmon/ntc_thermistor.c | 19 +++++++++++++------
>>   1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/hwmon/ntc_thermistor.c b/drivers/hwmon/ntc_thermistor.c
>> index 0d29c8f97ba7c2f264588b6309b91ca494012ad6..311a60498d409ea068a18590415b2d5b43e73eb1 100644
>> --- a/drivers/hwmon/ntc_thermistor.c
>> +++ b/drivers/hwmon/ntc_thermistor.c
>> @@ -387,12 +387,9 @@ static int get_ohm_of_thermistor(struct ntc_data *data, unsigned int uv)
>>        puo = data->pullup_ohm;
>>        pdo = data->pulldown_ohm;
>>  
>> -     if (uv == 0)
>> -             return (data->connect == NTC_CONNECTED_POSITIVE) ?
>> -                     INT_MAX : 0;
>> -     if (uv >= puv)
>> -             return (data->connect == NTC_CONNECTED_POSITIVE) ?
>> -                     0 : INT_MAX;
>> +     /* faulty adc value */
>> +     if (uv == 0 || uv >= puv)
>> +             return -ENODATA;
>>  
>>        if (data->connect == NTC_CONNECTED_POSITIVE && puo == 0)
>>                n = div_u64(pdo * (puv - uv), uv);
>> @@ -404,6 +401,16 @@ static int get_ohm_of_thermistor(struct ntc_data *data, unsigned int uv)
>>        else
>>                n = div64_u64_safe(pdo * puo * uv, pdo * (puv - uv) - puo * uv);
>>  
>> +     /* sensor out of bounds */
>> +     if (n > data->comp[0].ohm || n < data->comp[data->n_comp-1].ohm)
>> +             return -ENODATA;
>> +
>> +     /*
>> +      * theoretically data->comp[0].ohm can be greater than INT_MAX as it is an
>> +      * unsigned integer, but it doesn't make any sense for it to be so as the
>> +      * maximum return value of this function is INT_MAX, so it will never be
>> +      * able to properly calculate that temperature.
>> +      */
>>        if (n > INT_MAX)
>>                n = INT_MAX;
>
>I am not a friend of theoretic code or comments like this. Please drop.
>The original code was intended to catch out-of-bounds values which would
>otherwise have been reported as error, not to catch unrealistic resistor
>values in the compensation tables.

So, drop the check and comment? Or just drop the comment? I was thinking
to fully remove that check in an earlier comment in my cover letter.

Kind regards,
Maud

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ