linux-kernel - Re: [GIT pull] irq updates for 4.13

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.20.1707112307490.15368@nanos>
Date:   Tue, 11 Jul 2017 23:41:52 +0200 (CEST)
From:   Thomas Gleixner <tglx@...x.de>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
cc:     Thomas Gleixner <tglx@...utronix.de>,
        Tony Lindgren <tony@...mide.com>,
        Sebastian Reichel <sebastian.reichel@...labora.co.uk>,
        LKML <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Ingo Molnar <mingo@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>, Pavel Machek <pavel@....cz>,
        Linus Walleij <linus.walleij@...aro.org>,
        Grygorii Strashko <grygorii.strashko@...com>
Subject: Re: [GIT pull] irq updates for 4.13

On Tue, 11 Jul 2017, Linus Torvalds wrote:

> On Tue, Jul 11, 2017 at 10:52 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
> >
> > Not completely, because of the free path issues. See the other mail. Tony
> > confirmed that it works. I wait for Sebastian and queue it with a proper
> > changelog, ok?
> 
> Ugh, I absolutely detest your ugly "bool buslock" parameter to
> irq_release_resources().

Yes, that was a knee jerk reaction to avoid the buslock/unlock dance if the
chip does not have a irq_release_resources() callback. Silly in hindsight
as this is really not a fast path.

> And there seems to be no reason for it.
> 
> Why don't you just move the
> 
>         chip_bus_sync_unlock(desc);
> 
> call in __free_irq() down to just before you release the request_mutex?

See below.

> In fact, looking at __free_irq(), I note that it's locking is
> completely broken shit. Look at the
> 
>                 if (!action) {
>                         WARN(1, "Trying to free already-free IRQ %d\n", irq);
> 
> error case, and look for where it unlocks request_mutex. Yeah, it doesn't.

Yes, I noticed that and my patch fixed it already.

> Why not fix those stupid bugs and clean things up at the same time?
> Make the rule be that as you take the request_mutex lock, you then
> also do the chip_bus_lock().
> 
> And when you release the request_mutex lock, you do
> chip_bus_sync_unlock() just before.
> 
> And no, I have no idea what the locking rules are for
> irq_finalize_oneshot() - it does that chip_bus_lock() without having
> any external serialization. Is that ok? Are the chip handlers able to
> deal with that? Same seems to go for free_percpu_irq().

The extra serialization is only required to protect stuff across
request/free. Everything else is serialized via bus_lock (if the irq chip
has it) and the desc->lock spinlock.

> Comments? Even if they are "Linus, you're way out of line, and you
> can't just move that chip_bus_sync_unlock() down like that because of
> XYZ, you moron".
> 
> For example, it's entirely possible that we can't do the
> "synchronize_irq()" waiting while we hold that chip_bus_lock().  But
> the ones I looked at seemed to all take sleeping locks (or no locks at
> all - doing other things), which implies that they certainly can't be
> blocking irq delivery.

We can't do that move for two reasons:

   1) The data which has been changed between bus_lock/un_lock is cached in
      the irq chip driver private data and needs to go out to the irq chip
      via the slow bus (usually SPI or I2C).

      That's the reason why this bus_lock/unlock magic exists in the first
      place, as you cannot do SPI/I2C transactions while holding desc->lock
      with interrupts disabled.

   2) synchronize_irq() will actually deadlock, if there is a handler on
      flight. These chips use threaded handlers for obvious reasons, as
      they allow to do SPI/I2C communication. When the threaded handler
      returns then bus_lock needs to be taken in irq_finalize_oneshot() as
      we need to talk to the actual irq chip once more. After that the
      threaded handler is marked done, which makes synchronize_irq() return.

      So if we hold bus_lock accross the synchronize_irq() call, the
      handler cannot mark itself done because it blocks on the bus
      lock. That in turn makes synchronize_irq() wait forever on the
      threaded handler to complete....

      Preventing that would require even uglier conditional locking in
      irq_finalize_oneshot(). /me runs and hides

Here is a revised version of the previous patch with the conditional
locking removed and a bunch of comments added.

Thanks,

	tglx

8<----------------------------
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1090,6 +1090,16 @@ setup_irq_thread(struct irqaction *new,
 /*
  * Internal function to register an irqaction - typically used to
  * allocate special interrupts that are part of the architecture.
+ *
+ * Locking rules:
+ *
+ * desc->request_mutex	Provides serialization against a concurrent free_irq()
+ *   chip_bus_lock	Provides serialization for slow bus operations
+ *     desc->lock	Provides serialization against hard interrupts
+ *
+ * chip_bus_lock and desc->lock are sufficient for all other management and
+ * interrupt related functions. desc->request_mutex solely serializes
+ * request/free_irq().
  */
 static int
 __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
@@ -1167,20 +1177,35 @@ static int
 	if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)
 		new->flags &= ~IRQF_ONESHOT;
 
+	/*
+	 * Protects against a concurrent __free_irq() call which might wait
+	 * for synchronize_irq() to complete without holding the optional
+	 * chip bus lock and desc->lock.
+	 */
 	mutex_lock(&desc->request_mutex);
+
+	/*
+	 * Acquire bus lock as the irq_request_resources() callback below
+	 * might rely on the serialization or the magic power management
+	 * functions which are abusing the irq_bus_lock() callback,
+	 */
+	chip_bus_lock(desc);
+
+	/* First installed action requests resources. */
 	if (!desc->action) {
 		ret = irq_request_resources(desc);
 		if (ret) {
 			pr_err("Failed to request resources for %s (irq %d) on irqchip %s\n",
 			       new->name, irq, desc->irq_data.chip->name);
-			goto out_mutex;
+			goto out_bus_unlock;
 		}
 	}
 
-	chip_bus_lock(desc);
-
 	/*
 	 * The following block of code has to be executed atomically
+	 * protected against a concurrent interrupt and any of the other
+	 * management calls which are not serialized via
+	 * desc->request_mutex or the optional bus lock.
 	 */
 	raw_spin_lock_irqsave(&desc->lock, flags);
 	old_ptr = &desc->action;
@@ -1286,10 +1311,8 @@ static int
 			ret = __irq_set_trigger(desc,
 						new->flags & IRQF_TRIGGER_MASK);
 
-			if (ret) {
-				irq_release_resources(desc);
+			if (ret)
 				goto out_unlock;
-			}
 		}
 
 		desc->istate &= ~(IRQS_AUTODETECT | IRQS_SPURIOUS_DISABLED | \
@@ -1385,12 +1408,10 @@ static int
 out_unlock:
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
 
-	chip_bus_sync_unlock(desc);
-
 	if (!desc->action)
 		irq_release_resources(desc);
-
-out_mutex:
+out_bus_unlock:
+	chip_bus_sync_unlock(desc);
 	mutex_unlock(&desc->request_mutex);
 
 out_thread:
@@ -1472,6 +1493,7 @@ static struct irqaction *__free_irq(unsi
 			WARN(1, "Trying to free already-free IRQ %d\n", irq);
 			raw_spin_unlock_irqrestore(&desc->lock, flags);
 			chip_bus_sync_unlock(desc);
+			mutex_unlock(&desc->request_mutex);
 			return NULL;
 		}
 
@@ -1498,6 +1520,20 @@ static struct irqaction *__free_irq(unsi
 #endif
 
 	raw_spin_unlock_irqrestore(&desc->lock, flags);
+	/*
+	 * Drop bus_lock here so the changes which were done in the chip
+	 * callbacks above are synced out to the irq chips which hang
+	 * behind a slow bus (I2C, SPI) before calling synchronize_irq().
+	 *
+	 * Aside of that the bus_lock can also be taken from the threaded
+	 * handler in irq_finalize_oneshot() which results in a deadlock
+	 * because synchronize_irq() would wait forever for the thread to
+	 * complete, which is blocked on the bus lock.
+	 *
+	 * The still held desc->request_mutex() protects against a
+	 * concurrent request_irq() of this irq so the release of resources
+	 * and timing data is properly serialized.
+	 */
 	chip_bus_sync_unlock(desc);
 
 	unregister_handler_proc(irq, action);
@@ -1530,8 +1566,15 @@ static struct irqaction *__free_irq(unsi
 		}
 	}
 
+	/* Last action releases resources */
 	if (!desc->action) {
+		/*
+		 * Reaquire bus lock as irq_release_resources() might
+		 * require it to deallocate resources over the slow bus.
+		 */
+		chip_bus_lock(desc);
 		irq_release_resources(desc);
+		chip_bus_sync_unlock(desc);
 		irq_remove_timings(desc);
 	}