lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <184d5217-149f-4f6a-b4a6-f897a8391873@roeck-us.net>
Date: Tue, 4 Mar 2025 07:04:04 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: maudspierings@...ontroll.com, Jean Delvare <jdelvare@...e.com>
Cc: linux-hwmon@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hwmon: (ntc_thermistor) return error instead of clipping
 on OOB

On 3/4/25 06:10, Maud Spierings via B4 Relay wrote:
> From: Maud Spierings <maudspierings@...ontroll.com>
> 
> When the ntc is reading Out Of Bounds instead of clipping to the nearest
> limit (min/max) return -ENODATA. This prevents malfunctioning sensors
> from sending a device into a shutdown loop due to a critical trip.
> 
> This implementation will only work for ntc type thermistors if a ptc
> type is to be implemented the min/max ohm calculation must be adjusted
> to take that into account.
> 
> Signed-off-by: Maud Spierings <maudspierings@...ontroll.com>
> ---
> This patch is a continuation of another discussion [1]. I felt like it
> should be a new patch, not a v2 as this is a very different change.
> 
> I have left the clamping of n to INT_MAX in the code with a comment, but
> it may be possible to find a better solution to it. One I thought of is
> to make the ohm field of the ntc_compensation struct a signed int as
> well. It would get rid of this weird edge case, but it doesn't make
> sense to allow for negative resistances to be entered into the sensor
> table.
> 
> Currently the only feedback this provides to the user is when they
> manually try to read the temperature and it returns the error. I have
> added a simple printk to these error points to see how spammy it gets
> and this is the result:
> 
> dmesg | grep hwmon
> [    4.982682] hwmon: sensor out of bounds
> [    5.249758] hwmon: sensor out of bounds
> [    5.633729] hwmon: sensor out of bounds
> [    6.215285] hwmon: sensor out of bounds
> [    7.073882] hwmon: sensor out of bounds
> [    7.486620] hwmon: sensor out of bounds
> [    8.833765] hwmon: sensor out of bounds
> [   10.785969] hwmon: sensor out of bounds
> [   13.793722] hwmon: sensor out of bounds
> [   16.761124] hwmon: sensor out of bounds
> [   17.889706] hwmon: sensor out of bounds
> [   25.057715] hwmon: sensor out of bounds
> [   35.041725] hwmon: sensor out of bounds
> [   50.110346] hwmon: sensor out of bounds
> [   72.945283] hwmon: sensor out of bounds
> [  105.712619] hwmon: sensor out of bounds
> [  154.863976] hwmon: sensor out of bounds
> [  164.937104] hwmon: sensor out of bounds
> [  228.590909] hwmon: sensor out of bounds
> [  315.365777] hwmon: sensor out of bounds
> [  464.718403] hwmon: sensor out of bounds
> [  615.079123] hwmon: sensor out of bounds
> [  764.496780] hwmon: sensor out of bounds
> 
> This is with polling-delay set to 1000, it seems to rate-limit itself?
> But I feel there should be a better way to communicate the potential
> sensor failure to the user, but I can't think of anything.
> 

Hmm ... we could add "fault" attributes and report a sensor fault
if uv == 0 or uv >= puv, together with the -ENODATA temperature reading.
That could distinguish the fault from the "resistor value out of range" case.

> [1]: https://lore.kernel.org/all/20250304-ntc_min_max-v1-1-b08e70e56459@gocontroll.com/

Most of the above should be after "---" since it is irrelevant for the commit log.

> ---
>   drivers/hwmon/ntc_thermistor.c | 19 +++++++++++++------
>   1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/hwmon/ntc_thermistor.c b/drivers/hwmon/ntc_thermistor.c
> index 0d29c8f97ba7c2f264588b6309b91ca494012ad6..311a60498d409ea068a18590415b2d5b43e73eb1 100644
> --- a/drivers/hwmon/ntc_thermistor.c
> +++ b/drivers/hwmon/ntc_thermistor.c
> @@ -387,12 +387,9 @@ static int get_ohm_of_thermistor(struct ntc_data *data, unsigned int uv)
>   	puo = data->pullup_ohm;
>   	pdo = data->pulldown_ohm;
>   
> -	if (uv == 0)
> -		return (data->connect == NTC_CONNECTED_POSITIVE) ?
> -			INT_MAX : 0;
> -	if (uv >= puv)
> -		return (data->connect == NTC_CONNECTED_POSITIVE) ?
> -			0 : INT_MAX;
> +	/* faulty adc value */
> +	if (uv == 0 || uv >= puv)
> +		return -ENODATA;
>   
>   	if (data->connect == NTC_CONNECTED_POSITIVE && puo == 0)
>   		n = div_u64(pdo * (puv - uv), uv);
> @@ -404,6 +401,16 @@ static int get_ohm_of_thermistor(struct ntc_data *data, unsigned int uv)
>   	else
>   		n = div64_u64_safe(pdo * puo * uv, pdo * (puv - uv) - puo * uv);
>   
> +	/* sensor out of bounds */
> +	if (n > data->comp[0].ohm || n < data->comp[data->n_comp-1].ohm)
> +		return -ENODATA;
> +
> +	/*
> +	 * theoretically data->comp[0].ohm can be greater than INT_MAX as it is an
> +	 * unsigned integer, but it doesn't make any sense for it to be so as the
> +	 * maximum return value of this function is INT_MAX, so it will never be
> +	 * able to properly calculate that temperature.
> +	 */
>   	if (n > INT_MAX)
>   		n = INT_MAX;

I am not a friend of theoretic code or comments like this. Please drop.
The original code was intended to catch out-of-bounds values which would
otherwise have been reported as error, not to catch unrealistic resistor
values in the compensation tables.

Thanks,
Guenter


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ