linux-kernel - Re: [patch 1/5] genirq: Prepare the handling of shared oneshot interrupts

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.LFD.2.00.1102241849310.2701@localhost6.localdomain6>
Date:	Thu, 24 Feb 2011 18:56:19 +0100 (CET)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
cc:	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [patch 1/5] genirq: Prepare the handling of shared oneshot
 interrupts

On Wed, 23 Feb 2011, Linus Torvalds wrote:
> On Wed, Feb 23, 2011 at 3:52 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
> >
> > +               /*
> > +                * We or the mask lockless. Safe because the code
> > +                * which clears the mask is serialized
> > +                * vs. IRQ_INPROGRESS.
> > +                */
> > +               desc->threads_oneshot |= action->thread_mask;
> > +               wake_up_process(action->thread);
> 
> That comment makes no sense.
> 
> What has "code which clears the mask" to do with anything?
> 
> Even if another CPU _sets_ a bit, doing so in parallel will lose bits
> if it's not locked. One CPU will read the old value, set its bit, and
> store the new value: with the race, only one of the bits will be set.
> 
> So maybe the code is safe, but the comment is still totally wrong. You
> had better be safe not just against people clearing bits, you need to
> be safe against ANY OTHER WRITER (clear or set), including very much
> OTHER BITS in the same word.
> 
> So if it really is safe, comment on ALL the cases.

Fair enough, that comment really sucks if you are not familiar with
the gory details of that code.

What about the following replacement, which might be overly detailed,
but that's exactly how it works.

Thanks,

	tglx

Index: linux-2.6-tip/kernel/irq/handle.c
===================================================================
--- linux-2.6-tip.orig/kernel/irq/handle.c
+++ linux-2.6-tip/kernel/irq/handle.c
@@ -57,18 +57,60 @@ static void irq_wake_thread(struct irq_d
 	 * Wake up the handler thread for this action. In case the
 	 * thread crashed and was killed we just pretend that we
 	 * handled the interrupt. The hardirq handler has disabled the
-	 * device interrupt, so no irq storm is lurking.
+	 * device interrupt, so no irq storm is lurking. If the
+	 * RUNTHREAD bit is already set, nothing to do.
 	 */
-	if (!test_bit(IRQTF_DIED, &action->thread_flags) &&
-	    !test_and_set_bit(IRQTF_RUNTHREAD, &action->thread_flags)) {
-		/*
-		 * We or the mask lockless. Safe because the code
-		 * which clears the mask is serialized
-		 * vs. IRQ_INPROGRESS.
-		 */
-		desc->threads_oneshot |= action->thread_mask;
-		wake_up_process(action->thread);
-	}
+	if (test_bit(IRQTF_DIED, &action->thread_flags) ||
+	    test_and_set_bit(IRQTF_RUNTHREAD, &action->thread_flags))
+		return;
+
+	/*
+	 * It's safe to OR the mask lockless here. We have only two
+	 * places which write to threads_oneshot: This code and the
+	 * irq thread.
+	 *
+	 * This code is the hard irq context and can never run on two
+	 * cpus in parallel. If it ever does we have more serious
+	 * problems than this bitmask.
+	 *
+	 * The irq threads of this irq which clear their "running" bit
+	 * in threads_oneshot are serialized via desc->lock against
+	 * each other and they are serialized against this code by
+	 * IRQS_INPROGRESS.
+	 *
+	 * Hard irq handler:
+	 *
+	 *	spin_lock(desc->lock);
+	 *	desc->state |= IRQS_INPROGRESS;
+	 *	spin_unlock(desc->lock);
+	 *	set_bit(IRQTF_RUNTHREAD, &action->thread_flags);
+	 *	desc->threads_oneshot |= mask;
+	 *	spin_unlock(desc->lock);
+	 *	desc->state &= ~IRQS_INPROGRESS;
+	 *	spin_unlock(desc->lock);
+	 *
+	 * irq thread:
+	 *
+	 * again:
+	 *	spin_lock(desc->lock);
+	 *	if (desc->state & IRQS_INPROGRESS) {
+	 *		spin_unlock(desc->lock);
+	 *		while(desc->state & IRQS_INPROGRESS)
+	 *			cpu_relax();
+	 *		goto again;
+	 *	}
+	 *	if (!test_bit(IRQTF_RUNTHREAD, &action->thread_flags))
+	 *		desc->threads_oneshot &= ~mask;
+	 *	spin_unlock(desc->lock);
+	 *
+	 * So either the thread waits for us to clear IRQS_INPROGRESS
+	 * or we are waiting in the flow handler for desc->lock to be
+	 * released before we reach this point. The thread also checks
+	 * IRQTF_RUNTHREAD under desc->lock. If set it leaves
+	 * threads_oneshot untouched and runs the thread another time.
+	 */
+	desc->threads_oneshot |= action->thread_mask;
+	wake_up_process(action->thread);
 }
 
 irqreturn_t
Index: linux-2.6-tip/kernel/irq/manage.c
===================================================================
--- linux-2.6-tip.orig/kernel/irq/manage.c
+++ linux-2.6-tip/kernel/irq/manage.c
@@ -637,7 +637,8 @@ again:
 	 *
 	 * This also serializes the state of shared oneshot handlers
 	 * versus "desc->threads_onehsot |= action->thread_mask;" in
-	 * handle_irq_event().
+	 * irq_wake_thread(). See the comment there which explains the
+	 * serialization.
 	 */
 	if (unlikely(desc->istate & IRQS_INPROGRESS)) {
 		raw_spin_unlock_irq(&desc->lock);