Message-ID: <3c6cbaf3-187b-1682-69b8-a2b34f23b928@gmail.com>
Date:   Thu, 17 Jun 2021 10:11:19 +0300
From:   Dmitry Osipenko <digetx@...il.com>
To:     Guenter Roeck <linux@...ck-us.net>
Cc:     Jean Delvare <jdelvare@...e.com>, linux-kernel@...r.kernel.org,
        linux-hwmon@...r.kernel.org
Subject: Re: [PATCH v1] hwmon: (lm90) Use edge-triggered interrupt

17.06.2021 03:12, Guenter Roeck wrote:
> On Wed, Jun 16, 2021 at 10:07:08PM +0300, Dmitry Osipenko wrote:
>> The LM90 driver uses level-based interrupt triggering. The interrupt
>> handler prints a warning message about the breached temperature and
>> quits. There is no way to stop interrupt from re-triggering since it's
>> level-based, thus thousands of warning messages are printed per second
>> once interrupt is triggered. Use edge-triggered interrupt in order to
>> fix this trouble.
>>
>> Fixes: 109b1283fb532 ("hwmon: (lm90) Add support to handle IRQ")
>> Signed-off-by: Dmitry Osipenko <digetx@...il.com>
>> ---
>>  drivers/hwmon/lm90.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
>> index ebbfd5f352c0..ce8ebe60fcdc 100644
>> --- a/drivers/hwmon/lm90.c
>> +++ b/drivers/hwmon/lm90.c
>> @@ -1908,7 +1908,7 @@ static int lm90_probe(struct i2c_client *client)
>>  		dev_dbg(dev, "IRQ: %d\n", client->irq);
>>  		err = devm_request_threaded_irq(dev, client->irq,
>>  						NULL, lm90_irq_thread,
>> -						IRQF_TRIGGER_LOW | IRQF_ONESHOT,
>> +						IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
>>  						"lm90", client);
> 
> We can't do that. Problem is that many of the devices supported by this driver
> behave differently when it comes to interrupts. Specifically, the interrupt
> handler is supposed to reset the interrupt condition (ie reading the status
> register should reset it). If that is not the case for a specific chip,
> we'll have to update the code to address the problem for that specific chip.
> The above code would probably just generate a single interrupt while never
> resetting the interrupt condition, which is obviously not what we want to
> happen.

The nct1008/72 datasheet [1] says that reading the status register
doesn't clear the interrupt until the temperature has returned to the
normal range, which is what I'm observing.

[1] https://www.onsemi.com/pdf/datasheet/nct1008-d.pdf

Page 10 "Status Register":

"Reading the status register clears the five flags, Bit 6 to Bit 2,
provided the error conditions causing the flags to be set have gone
away. A flag bit can be reset only if the corresponding value register
contains an in-limit measurement or if the sensor is good."

So the interrupt handler doesn't actually stop the interrupt from
recurring, and the kernel log (kmsg) is instantly flooded with:

...
[  217.484034] lm90 0-004c: temp2 out of range, please check!
[  217.484569] lm90 0-004c: temp2 out of range, please check!
[  217.485006] systemd-journald[179]: /dev/kmsg buffer overrun, some messages lost.
[  217.485109] lm90 0-004c: temp2 out of range, please check!
[  217.485699] lm90 0-004c: temp2 out of range, please check!
[  217.486235] lm90 0-004c: temp2 out of range, please check!
[  217.486776] lm90 0-004c: temp2 out of range, please check!
[  217.486874] systemd-journald[179]: /dev/kmsg buffer overrun, ...
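
To make the two behaviours concrete, here is a minimal userspace model of
the status-register semantics quoted from the datasheet. It is a sketch
and not driver code: struct sensor_model and all function names are
hypothetical, the 85 degree limit is an arbitrary value, and the real
interrupt controller is reduced to "run the handler while the line is
asserted" (level) versus "run the handler on an inactive-to-active
transition" (edge).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a sensor whose status flag stays latched
 * while the measurement is out of range. */
struct sensor_model {
	int temp;	/* latest measurement */
	int limit;	/* high limit */
	bool flag;	/* latched status flag, drives the ALERT line */
};

/* New measurement: the flag latches once the limit is exceeded. */
static void sensor_measure(struct sensor_model *s, int temp)
{
	s->temp = temp;
	if (temp > s->limit)
		s->flag = true;
}

/* Status read: clears the flag only if the measurement is in-limit. */
static bool sensor_read_status(struct sensor_model *s)
{
	bool was_set = s->flag;

	if (s->temp <= s->limit)
		s->flag = false;
	return was_set;
}

/* Level-triggered: the handler runs again for as long as the line is
 * still asserted when it returns. max_runs caps the simulation. */
static unsigned int count_level_irqs(struct sensor_model *s,
				     unsigned int max_runs)
{
	unsigned int runs = 0;

	assert(s);
	while (s->flag && runs < max_runs) {
		sensor_read_status(s);
		runs++;
	}
	return runs;
}

/* Edge-triggered: the handler runs once per inactive-to-active
 * transition of the line. */
static unsigned int count_edge_irqs(struct sensor_model *s,
				    const int *temps, unsigned int n)
{
	unsigned int runs = 0, i;

	assert(s && temps);
	for (i = 0; i < n; i++) {
		bool before = s->flag;

		sensor_measure(s, temps[i]);
		if (!before && s->flag) {
			sensor_read_status(s);
			runs++;
		}
	}
	return runs;
}
```

In this model, a sensor that stays above the limit keeps the level-triggered
handler running until the cap is hit, which corresponds to the flood above.
With edge triggering the handler runs once, and if nothing reads the status
register after the temperature drops, the flag stays latched and no further
edge arrives, which matches the concern raised about the original patch.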

It's interesting that the very first version of the nct1008-support
patch used edge-triggered interrupt flags [2].

[2] http://lkml.iu.edu/hypermail/linux/kernel/1104.1/01669.html

Limiting the interrupt rate could be an alternative solution.

What do you think about something like this:

diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c
index ce8ebe60fcdc..74886b8066ab 100644
--- a/drivers/hwmon/lm90.c
+++ b/drivers/hwmon/lm90.c
@@ -79,6 +79,7 @@
  * concern all supported chipsets, unless mentioned otherwise.
  */

+#include <linux/delay.h>
 #include <linux/module.h>
 #include <linux/init.h>
 #include <linux/slab.h>
@@ -201,6 +202,9 @@ enum chips { lm90, adm1032, lm99, lm86, max6657, max6659, adt7461, max6680,
 #define MAX6696_STATUS2_R2OT2	(1 << 6) /* remote2 emergency limit tripped */
 #define MAX6696_STATUS2_LOT2	(1 << 7) /* local emergency limit tripped */

+/* Prevent instant interrupt re-triggering */
+#define LM90_IRQ_DELAY		(15 * MSEC_PER_SEC)
+
 /*
  * Driver data (common to all clients)
  */
@@ -1756,10 +1760,12 @@ static irqreturn_t lm90_irq_thread(int irq, void *dev_id)
 	struct i2c_client *client = dev_id;
 	u16 status;

-	if (lm90_is_tripped(client, &status))
-		return IRQ_HANDLED;
-	else
+	if (!lm90_is_tripped(client, &status))
 		return IRQ_NONE;
+
+	msleep(LM90_IRQ_DELAY);
+
+	return IRQ_HANDLED;
 }

 static void lm90_remove_pec(void *dev)
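
As a rough back-of-the-envelope check of what the delay buys, the helper
below bounds how many times the handler can start within a time window
when each run keeps the interrupt masked for a given time. This is a
userspace sketch; max_handler_runs is a hypothetical name, and the
540 us per-run cost is only an estimate read off the spacing of the
timestamps in the log excerpt above.

```c
#include <assert.h>

/*
 * Upper bound on handler runs that can start within window_us when each
 * run occupies cost_us before the interrupt can fire again.
 * Runs start at 0, cost_us, 2 * cost_us, ... so the count is
 * ceil(window_us / cost_us).
 */
static unsigned long long max_handler_runs(unsigned long long window_us,
					   unsigned long long cost_us)
{
	assert(cost_us > 0);
	return (window_us + cost_us - 1) / cost_us;
}
```

With these assumptions, max_handler_runs(60000000, 540) gives 111112
messages per minute without the delay, while adding the 15 second sleep,
max_handler_runs(60000000, 15000540), gives 4.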
