linux-kernel - Re: threadirqs deadlocks

Message-ID: <87eegdzzez.fsf@nanos.tec.linutronix.de>
Date:   Wed, 17 Mar 2021 14:24:04 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Johan Hovold <johan@...nel.org>
Cc:     Krzysztof Kozlowski <krzysztof.kozlowski@...onical.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Andy Shevchenko <andy.shevchenko@...il.com>,
        linux-serial@...r.kernel.org, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: threadirqs deadlocks

Johan,

On Tue, Mar 16 2021 at 11:56, Johan Hovold wrote:
> We've gotten reports of lockdep splats correctly identifying a potential
> deadlock in serial drivers when running with forced interrupt threading.
>
> Typically, a serial driver takes the port spin lock in its interrupt
> handler, but unless also disabling interrupts the handler can be
> preempted by another interrupt which can end up calling printk. The
> console code then tries to take the port lock and we deadlock.
>
> It seems to me that forced interrupt threading cannot generally work
> without updating drivers that expose locks that can be taken by other
> interrupt handlers, for example, by using spin_lock_irqsave() in their
> interrupt handlers or marking their interrupts as IRQF_NO_THREAD.

The latter is the worst option because that will break PREEMPT_RT.

> What are your thoughts on this given that forced threading isn't that
> widely used and was said to be "mostly a debug option". Do we need to
> vet all current and future drivers and adapt them for "threadirqs"?
>
> Note that we now have people sending cleanup patches for interrupt
> handlers by search-and-replacing spin_lock_irqsave() with spin_lock()
> which can end up exposing this more.

It's true that for !RT it's primarily a debug option, but occasionally a
very valuable one because it does not take the whole machine down when
something explodes in an interrupt handler. Used it just a couple of
weeks ago successfully :)

So we have several ways out of that:

  1) Do the lock() -> lock_irqsave() dance

  2) Delay printing from hard interrupt context (which is what RT does)

  3) Actually disable interrupts before calling the force threaded
     handler.

I'd say #3 is the right fix here. It preserves the !RT semantics and
the usefulness of threadirqs for debugging, and spares us dealing with
the script kiddies.

Something like the below.

Thanks,

        tglx
---
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1142,11 +1142,15 @@ irq_forced_thread_fn(struct irq_desc *de
 	irqreturn_t ret;
 
 	local_bh_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_disable();
 	ret = action->thread_fn(action->irq, action->dev_id);
 	if (ret == IRQ_HANDLED)
 		atomic_inc(&desc->threads_handled);
 
 	irq_finalize_oneshot(desc, action);
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_enable();
 	local_bh_enable();
 	return ret;
 }
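
For illustration only, the effect of disabling interrupts around the
force threaded handler can be modelled in userspace. The sketch below is
a toy single-CPU model and not kernel code: every name in it
(model_irq_disable(), hardirq_printk(), run_forced_thread() and so on)
is hypothetical, and a real CPU would spin forever where the model just
counts a deadlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU state. All names are hypothetical, not kernel APIs. */
static bool irqs_enabled = true;	/* models the CPU interrupt flag */
static bool port_locked;		/* models the serial port spinlock */
static bool irq_pending;		/* interrupt raised while masked */
static int deadlocks;

/*
 * Models a hard interrupt handler that calls printk(), which needs the
 * port lock for console output.
 */
static void hardirq_printk(void)
{
	if (port_locked) {
		/*
		 * The lock holder is the interrupted thread on this same
		 * CPU, so a real spinlock would never be released here.
		 */
		deadlocks++;
		return;
	}
	port_locked = true;	/* console output would happen here */
	port_locked = false;
}

/* Another device raises an interrupt on this CPU. */
static void raise_irq(void)
{
	if (irqs_enabled)
		hardirq_printk();
	else
		irq_pending = true;	/* delivered once unmasked */
}

static void model_irq_disable(void)
{
	irqs_enabled = false;
}

static void model_irq_enable(void)
{
	irqs_enabled = true;
	if (irq_pending) {
		irq_pending = false;
		hardirq_printk();
	}
}

/* Serial handler that takes the port lock without masking interrupts. */
static void serial_handler(void)
{
	port_locked = true;
	raise_irq();		/* an interrupt arrives while the lock is held */
	port_locked = false;
}

/* Runs the force threaded handler; returns the deadlocks observed. */
static int run_forced_thread(bool disable_irqs)
{
	assert(irqs_enabled);	/* each run starts with interrupts enabled */
	deadlocks = 0;

	if (disable_irqs)
		model_irq_disable();
	serial_handler();
	if (disable_irqs)
		model_irq_enable();

	return deadlocks;
}
```

In this model run_forced_thread(false) reports one deadlock, because the
interrupt lands while the port lock is held, and run_forced_thread(true)
reports none, because the interrupt is only delivered after the handler
has dropped the lock.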