lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20240116073502.2356090-1-leobras@redhat.com>
Date: Tue, 16 Jan 2024 04:34:57 -0300
From: Leonardo Bras <leobras@...hat.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jiri Slaby <jirislaby@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Florian Fainelli <f.fainelli@...il.com>,
	John Ogness <john.ogness@...utronix.de>,
	Tony Lindgren <tony@...mide.com>,
	Marcelo Tosatti <mtosatti@...hat.com>
Cc: Leonardo Bras <leobras@...hat.com>,
	linux-kernel@...r.kernel.org,
	linux-serial@...r.kernel.org
Subject: [RFC PATCH v1 1/2] irq/spurious: Reset irqs_unhandled if an irq_thread handles one IRQ request

When the IRQs are threaded, the part of the handler that runs in
interruption context can be pretty fast, as per design, while letting the
slow part to run into the thread handler.

In some cases, given IRQs can be triggered too fast, making it impossible
for the irq_thread to be able to keep up handling every request.

If two requests happen before any irq_thread handler is able to finish,
no increment to threads_handled happen, causing threads_handled and
threads_handled_last to be equal, which will ends up
causing irqs_unhandled to be incremented in note_interrupt().

Once irqs_unhandled gets to ~100k, the IRQ line gets disabled, disrupting
the device work.

As of today, the only way to reset irqs_unhandled before disabling the IRQ
line is to stay 100ms without having any increment to irqs_unhandled, which
can be pretty hard to happen if the IRQ is very busy.

On top of that, some irq_thread handlers can handle requests in batches,
effectively incrementing threads_handled only once despite dealing with a
lot of requests, which make the irqs_unhandled to reach 100k pretty fast
if the IRQ is getting a lot of requests.

This IRQ line disable bug can be easily reproduced with a serial8250
console on a PREEMPT_RT kernel: it only takes the user to print a lot
of text to the console (or to ttyS0): around 300k chars should be enough.

To fix this bug, reset irqs_unhandled whenever irq_thread handles at least
one IRQ request.

This fix makes possible to avoid disabling IRQs which irq_thread handlers
can take long (while on heavy usage of the IRQ line), without losing the
ability of disabling IRQs that actually get unhandled for too long.

Signed-off-by: Leonardo Bras <leobras@...hat.com>
---
 kernel/irq/spurious.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c
index 02b2daf074414..b60748f89738a 100644
--- a/kernel/irq/spurious.c
+++ b/kernel/irq/spurious.c
@@ -339,6 +339,14 @@ void note_interrupt(struct irq_desc *desc, irqreturn_t action_ret)
 			handled |= SPURIOUS_DEFERRED;
 			if (handled != desc->threads_handled_last) {
 				action_ret = IRQ_HANDLED;
+
+				/*
+				 * If the thread handlers handle
+				 * one IRQ reset the unhandled
+				 * IRQ counter.
+				 */
+				desc->irqs_unhandled = 0;
+
 				/*
 				 * Note: We keep the SPURIOUS_DEFERRED
 				 * bit set. We are handling the
-- 
2.43.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ