Message-ID: <87pnf342pr.fsf@nanos.tec.linutronix.de>
Date:   Tue, 28 Jan 2020 23:48:32 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Evan Green <evgreen@...omium.org>
Cc:     Rajat Jain <rajatja@...gle.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        linux-pci <linux-pci@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        x86@...nel.org, Marc Zyngier <maz@...nel.org>
Subject: Re: [PATCH v2] PCI/MSI: Avoid torn updates to MSI pairs

Evan,

Evan Green <evgreen@...omium.org> writes:
> On Tue, Jan 28, 2020 at 6:38 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>> The patch is only lightly tested, but so far it survived.
>>
>
> Hi Thomas,
> Thanks for the patch, I gave it a try. I get the following splat, then a hang:
>
> [   62.238406]        CPU0
> [   62.241135]        ----
> [   62.243863]   lock(vector_lock);
> [   62.247467]   lock(vector_lock);
> [   62.251071]
> [   62.251071]  *** DEADLOCK ***
> [   62.251071]
> [   62.257687]  May be due to missing lock nesting notation
> [   62.257687]
> [   62.265274] 2 locks held by migration/1/17:
> [   62.269946]  #0: 00000000cfa9d8c3 (&irq_desc_lock_class){-.-.}, at:
> irq_migrate_all_off_this_cpu+0x44/0x28f
> [   62.280846]  #1: 000000006885da2d (vector_lock){-.-.}, at:
> msi_set_affinity+0x13c/0x27b
> [   62.289801]
> [   62.289801] stack backtrace:
> [   62.294669] CPU: 1 PID: 17 Comm: migration/1 Not tainted 4.19.96 #2
> [   62.310713] Call Trace:
> [   62.313446]  dump_stack+0xac/0x11e
> [   62.317255]  __lock_acquire+0x64f/0x19bc
> [   62.321646]  ? find_held_lock+0x3d/0xb8
> [   62.325936]  ? pci_conf1_write+0x4f/0xdf
> [   62.330320]  lock_acquire+0x1b2/0x1fa
> [   62.334413]  ? apic_retrigger_irq+0x31/0x63
> [   62.339097]  _raw_spin_lock_irqsave+0x51/0x7d
> [   62.343972]  ? apic_retrigger_irq+0x31/0x63
> [   62.348646]  apic_retrigger_irq+0x31/0x63
> [   62.353124]  msi_set_affinity+0x25a/0x27b

Bah. I'm sure I looked at that call chain, noticed the double vector
lock and then forgot. Delta patch below.

Thanks,

        tglx

8<--------------
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -64,6 +64,7 @@ msi_set_affinity(struct irq_data *irqd,
 	struct irq_cfg old_cfg, *cfg = irqd_cfg(irqd);
 	struct irq_data *parent = irqd->parent_data;
 	unsigned int cpu;
+	bool pending;
 	int ret;
 
 	/* Save the current configuration */
@@ -147,9 +148,13 @@ msi_set_affinity(struct irq_data *irqd,
 	 * vector/CPU. Check whether the transition raced with a device
 	 * interrupt and is pending in the local APICs IRR.
 	 */
-	if (lapic_vector_set_in_irr(cfg->vector))
-		irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);
+	pending = lapic_vector_set_in_irr(cfg->vector);
+
 	unlock_vector_lock();
+
+	if (pending)
+		irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);
+
 	return ret;
 }
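
Editorial illustration, not part of the patch: the delta above samples the
pending state while vector_lock is held, drops the lock, and only then calls
the retrigger callback, which (per the splat) takes vector_lock itself. A
minimal userspace sketch of that pattern follows, assuming a pthread mutex as
a stand-in for the non-recursive spinlock. Every name ending in _sketch, and
retrigger_count, is hypothetical and is not a kernel API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for vector_lock (non-recursive). */
static pthread_mutex_t vector_lock_sketch = PTHREAD_MUTEX_INITIALIZER;
/* Hypothetical stand-in for "vector is pending in the local APIC's IRR". */
static bool irr_bit_sketch;
/* Counts how often the retrigger path ran; for illustration only. */
static int retrigger_count;

/* Models a retrigger callback that acquires the lock on its own. */
static void retrigger_sketch(void)
{
	pthread_mutex_lock(&vector_lock_sketch);
	retrigger_count++;
	irr_bit_sketch = false;
	pthread_mutex_unlock(&vector_lock_sketch);
}

/*
 * Models the fixed ordering: read the state under the lock, release the
 * lock, then call the helper. Calling retrigger_sketch() before the unlock
 * would try to take the same non-recursive lock twice on one thread.
 */
static int set_affinity_sketch(void)
{
	bool pending;

	pthread_mutex_lock(&vector_lock_sketch);
	/* ... the affinity update would happen here, under the lock ... */
	pending = irr_bit_sketch;
	pthread_mutex_unlock(&vector_lock_sketch);

	if (pending)
		retrigger_sketch();

	return 0;
}

/* Usage: one pending interrupt is retriggered once, without self-deadlock. */
static void demo_sketch(void)
{
	irr_bit_sketch = true;
	set_affinity_sketch();
	assert(retrigger_count == 1);
}
```

The sampled value can be stale by the time the helper runs. In this sketch
that is harmless, since an extra retrigger only repeats work.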
 