Message-ID: <c55b15c8-df49-6458-56ea-a753ae578d18@gmail.com>
Date: Mon, 21 Feb 2022 19:13:01 +0300
From: Dmitry Osipenko <digetx@...il.com>
To: Guenter Roeck <linux@...ck-us.net>,
Jon Hunter <jonathanh@...dia.com>,
Jean Delvare <jdelvare@...e.com>
Cc: linux-kernel@...r.kernel.org, linux-hwmon@...r.kernel.org,
linux-tegra@...r.kernel.org
Subject: Re: [PATCH v3 2/4] hwmon: (lm90) Use hwmon_notify_event()
21.02.2022 19:02, Guenter Roeck wrote:
> On 2/21/22 07:49, Jon Hunter wrote:
>>
>> On 21/02/2022 15:43, Guenter Roeck wrote:
>>
>> ...
>>
>>>> We observed a random null pointer dereference crash somewhere in the
>>>> thermal core (crash log below is not very helpful) when calling
>>>> mutex_lock(). It looks like we get an interrupt when this crash
>>>> happens.
>>>>
>>>> Looking at the lm90 driver, per the above, I now see we are calling
>>>> hwmon_notify_event() from the lm90 interrupt handler. Looking at
>>>> hwmon_notify_event() I see that ...
>>>>
>>>> hwmon_notify_event()
>>>> --> hwmon_thermal_notify()
>>>> --> thermal_zone_device_update()
>>>> --> update_temperature()
>>>> --> mutex_lock()
>>>>
>>>> So although I don't completely understand the crash, it does seem
>>>> that we should not be calling hwmon_notify_event() from the
>>>> interrupt handler.
>>>>
>>> As mentioned separately, this is not the problem.
>>
>> Yes I can see that now.
>>
>>> I think the problem may be that this is not a devicetree system
>>> (or the lm90 device does not have a devicetree node), but thermal
>>> notification currently only works in such systems because the hwmon
>>> subsystem uses the devicetree registration method. At the same time,
>>> CONFIG_THERMAL_OF is obviously enabled. Unfortunately, the hwmon code
>>> does not bail out in that situation due to another bug.
>>
>> The platform I see this on does use device-tree and it does have a
>> node for the ti,tmp451 device which uses the lm90 device. This
>> platform uses the device-tree source
>> arch/arm64/boot/dts/nvidia/tegra194-p2972-0000.dts and the tmp451 node
>> is in arch/arm64/boot/dts/nvidia/tegra194-p2888.dtsi.
>>
>
> Interesting. It appears that the call to
> devm_thermal_zone_of_sensor_register() in the hwmon core nevertheless
> returns -ENODEV, which is not handled properly in the hwmon core. I can
> see a number of reasons for this to happen:
> - there is no devicetree node for the lm90 device
> - there is no thermal-zones devicetree node
> - there is no thermal zone entry in the thermal-zones node which matches
> the sensor
>
> We'll have to revert the lm90 changes until this is sorted out.
Oh, yeah. It seems there is a problem there, and the tzd pointer could
hold -ENODEV as an error pointer. But it's a hwmon core problem, which
apparently has existed for a long time, not an lm90 problem.
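
To illustrate the kind of failure being discussed, here is a minimal
userspace sketch, not the actual hwmon code. ERR_PTR() and IS_ERR() are
re-implemented locally to mimic the kernel helpers; struct thermal_zone,
struct sensor, zone_register(), would_deref_null_check_only() and
notify() are hypothetical names. It assumes the stored tzd pointer is
only tested against NULL before use, which would let an error pointer
carrying -ENODEV through.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins that mimic the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers live in the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, heavily simplified types for illustration only. */
struct thermal_zone {
	int updates;
};

struct sensor {
	struct thermal_zone *tzd;
};

/* Pretend registration: fails with -ENODEV when no zone matches. */
static struct thermal_zone *zone_register(struct thermal_zone *match)
{
	return match ? match : ERR_PTR(-ENODEV);
}

/*
 * Assumed buggy pattern: only a NULL check guards the pointer.
 * Returns 1 if the caller would go on to dereference s->tzd.
 */
static int would_deref_null_check_only(const struct sensor *s)
{
	return s->tzd != NULL;
}

/* Safer pattern: reject both NULL and error pointers before use. */
static int notify(struct sensor *s)
{
	if (!s->tzd || IS_ERR(s->tzd))
		return -ENODEV;

	s->tzd->updates++;
	return 0;
}
```

With a sensor whose tzd came from a failed zone_register(), the
NULL-only check reports that the pointer would be used, while notify()
returns -ENODEV without touching it.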