Message-ID: <1344379147.27383.29.camel@sbsiddha-desk.sc.intel.com>
Date: Tue, 07 Aug 2012 15:39:07 -0700
From: Suresh Siddha <suresh.b.siddha@...el.com>
To: Borislav Petkov <bp@...64.org>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Robert Richter <robert.richter@....com>, mingo@...nel.org,
hpa@...or.com, linux-kernel@...r.kernel.org,
akpm@...ux-foundation.org, torvalds@...ux-foundation.org,
a.p.zijlstra@...llo.nl, tglx@...utronix.de,
linux-tip-commits@...r.kernel.org
Subject: Re: do_IRQ: 1.55 No irq handler for vector (irq -1)
On Tue, 2012-08-07 at 22:57 +0200, Borislav Petkov wrote:
> The funny thing is, they deliver to all CPUs except the BSP.
Judging by your /proc/interrupts below, it is probably using some sort
of round-robin delivery.
> Or maybe the BSP gets that IRQ too but it actually has a handler
> registered?
From the /proc/interrupts you sent, the BSP is also getting those.
>
> Btw, I'm stabbing in the dark here - I have been purposefully and
> willfully keeping away from all the APIC debacle until now. I guess that
> carefree time is over :(.
>
> > Certainly outside of x2apic mode I have seen that happen and that is why
> > the reservation in lowest priroity delivery mode was for the same vector
> > across all cpus.
> >
> > This certainly looks like we have one irq going across multiple cpus
> > and the software simply appears unprepared for the irq to show up where
> > the irq is showing up.
>
> The interesting thing is that this happens once per core early during
> boot and not anymore. I dropped the printk_ratelimit() in do_IRQ and
> still got those lines only once in dmesg.
What it says is that the interrupts are arriving at the offline CPUs as
well. In pre-3.6-rc1 kernels, the vectors assigned to the legacy IRQs are
fixed (IRQ0_VECTOR, ...).
In 3.6-rc1, we allow the vector to change when the IO-APIC starts
handling the IRQ, and that is probably a bad idea given that this
platform is spraying interrupts (mostly timer?) onto the offline CPUs as
well. Pre-3.6, we handled those interrupts when the CPU came online. In
the new kernels, because the vector has changed by the time the IRQ is
handled in IO-APIC mode, we get a spurious "No irq handler for vector"
message.
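To make the sequence above concrete, here is a small userspace model (not kernel code). All names and vector values in it are made up for illustration, and it simplifies the real behaviour by clearing the old vector slot immediately on every CPU:

```c
#include <stdio.h>

#define NR_CPUS_MODEL    4
#define NR_VECTORS_MODEL 256
#define OLD_VECTOR       0x30   /* illustrative value only */
#define NEW_VECTOR       0x41   /* illustrative value only */

/* Per-CPU vector -> irq table; -1 means "no irq bound to this vector". */
static int vector_irq_model[NR_CPUS_MODEL][NR_VECTORS_MODEL];

static void tables_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		for (int v = 0; v < NR_VECTORS_MODEL; v++)
			vector_irq_model[cpu][v] = -1;
}

/* Bind irq to the same vector on every CPU, as for a fixed legacy vector. */
static void bind_vector_all_cpus(int irq, int vector)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		vector_irq_model[cpu][vector] = irq;
}

/* Move irq to a new vector; the old slot is cleared on every CPU. */
static void change_vector(int irq, int old_vector, int new_vector)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		vector_irq_model[cpu][old_vector] = -1;
		vector_irq_model[cpu][new_vector] = irq;
	}
}

/* Look up the irq for an interrupt arriving on (cpu, vector). */
static int deliver(int cpu, int vector)
{
	int irq = vector_irq_model[cpu][vector];

	if (irq < 0)
		printf("do_IRQ: %d.%d No irq handler for vector (irq %d)\n",
		       cpu, vector, irq);
	return irq;
}
```

With the vector left fixed, an interrupt that was sent to a CPU on OLD_VECTOR before it came online still resolves to its irq. Once change_vector() has run, the same late delivery finds -1 in the table and prints the message.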
> The other funny thing is, irq 55 is not in /proc/interrupts:
'55' is the vector number. You have to add some debug code in the kernel
to identify what irq it used to belong to.
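One possible shape for such debug code, sketched as a standalone model rather than a kernel patch: keep a shadow table that remembers the last irq each vector was bound to. The table and helper names here are hypothetical, and nothing like this exists in the tree as far as I know.

```c
#define NR_VECTORS_DBG 256

/* Current vector -> irq binding; -1 means unbound. */
static int vector_irq_dbg[NR_VECTORS_DBG];
/* Hypothetical debug-only shadow: last irq that owned each vector. */
static int vector_last_irq_dbg[NR_VECTORS_DBG];

static void dbg_init(void)
{
	for (int v = 0; v < NR_VECTORS_DBG; v++) {
		vector_irq_dbg[v] = -1;
		vector_last_irq_dbg[v] = -1;
	}
}

/* Rebind a vector, remembering the previous owner if there was one. */
static void dbg_set_vector(int vector, int irq)
{
	if (vector_irq_dbg[vector] >= 0)
		vector_last_irq_dbg[vector] = vector_irq_dbg[vector];
	vector_irq_dbg[vector] = irq;
}

/* Which irq did this vector belong to before its current binding? */
static int dbg_former_irq(int vector)
{
	return vector_last_irq_dbg[vector];
}
```

In the "no handler" path one would then print dbg_former_irq(vector) next to the vector number. In the kernel the tables are per-CPU, so the shadow would need to be per-CPU too.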
>
> CPU0 CPU1 CPU2 CPU3
> 0: 44 0 0 0 IO-APIC-edge timer
> 1: 2 1 2 4 IO-APIC-edge i8042
> 8: 6 7 6 6 IO-APIC-edge rtc0
> 9: 22 25 24 21 IO-APIC-fasteoi acpi
> 12: 31 23 30 30 IO-APIC-edge i8042
> 16: 82 82 81 117 IO-APIC-fasteoi snd_hda_intel
> 17: 0 1 1 0 IO-APIC-fasteoi ehci_hcd:usb1, ehci_hcd:usb2
> 18: 3 6 8 8 IO-APIC-fasteoi ohci_hcd:usb3, ohci_hcd:usb4, ohci_hcd:usb5
> 40: 0 0 0 0 PCI-MSI-edge PCIe PME
> 41: 0 0 0 0 PCI-MSI-edge PCIe PME
> 42: 0 0 0 0 PCI-MSI-edge PCIe PME
> 43: 0 0 0 0 PCI-MSI-edge PCIe PME
> 44: 675 662 676 690 PCI-MSI-edge ahci
> 45: 41 44 38 41 PCI-MSI-edge snd_hda_intel
> 46: 13484 13499 13501 13536 PCI-MSI-edge eth0
> NMI: 0 0 0 0 Non-maskable interrupts
> LOC: 20719 21487 18015 16445 Local timer interrupts
> SPU: 0 0 0 0 Spurious interrupts
> PMI: 0 0 0 0 Performance monitoring interrupts
> IWI: 0 0 0 0 IRQ work interrupts
> RTR: 0 0 0 0 APIC ICR read retries
> RES: 13744 12640 13425 12334 Rescheduling interrupts
> CAL: 571 790 539 801 Function call interrupts
> TLB: 0 0 0 0 TLB shootdowns
> TRM: 0 0 0 0 Thermal event interrupts
> THR: 0 0 0 0 Threshold APIC interrupts
> MCE: 0 0 0 0 Machine check exceptions
> MCP: 66 66 66 66 Machine check polls
> ERR: 0
> MIS: 0
>
> so what is that thing?
And in the case of Robert's SATA hang: because we modify the vector when
the IRQ is handled by the IO-APIC, cfg->move_in_progress gets set during
setup_ioapic_irq(). Later, when we call setup_ioapic_dest() to update
the SMP affinity, we fail to update the RTEs because
cfg->move_in_progress is still set (it is cleared only after the first
interrupt arrives).
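The ordering problem can be sketched with a tiny userspace model. The struct, the function names, and the -EBUSY return value are assumptions made for illustration and only mirror the flow described above:

```c
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-in for the per-irq config; not the kernel's struct. */
struct irq_cfg_model {
	int vector;
	unsigned int domain;     /* bitmask of CPUs */
	bool move_in_progress;
	int rte_dest;            /* CPU the RTE currently points at */
};

/* Assign a vector/domain; refuse while an earlier move is pending. */
static int assign_vector_model(struct irq_cfg_model *cfg, int new_vector,
			       unsigned int new_domain)
{
	if (cfg->move_in_progress)
		return -EBUSY;
	/* Changing an already-assigned vector starts a move. */
	if (cfg->vector != 0 && cfg->vector != new_vector)
		cfg->move_in_progress = true;
	cfg->vector = new_vector;
	cfg->domain = new_domain;
	return 0;
}

/* Affinity update: the RTE is rewritten only if the assignment succeeds. */
static int set_dest_model(struct irq_cfg_model *cfg, unsigned int new_domain,
			  int dest_cpu)
{
	int ret = assign_vector_model(cfg, cfg->vector, new_domain);

	if (ret)
		return ret;
	cfg->rte_dest = dest_cpu;
	return 0;
}

/* The first interrupt on the new vector completes the move. */
static void first_interrupt_model(struct irq_cfg_model *cfg)
{
	cfg->move_in_progress = false;
}
```

Here a vector change followed directly by set_dest_model() leaves rte_dest at its old value. The same call succeeds only after first_interrupt_model() has cleared the flag.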
On Robert's system, all the interrupts go only to the last CPU (cpu-7).
As we fail to update the RTEs with the SMP affinity in
setup_ioapic_dest(), the RTE is still pointing to cpu-0 (but the
vector_to_irq mapping is set on all the CPUs), and most likely Robert's
platform for some reason doesn't like that (though we don't see "No irq
handler" messages on Robert's platform).
Boris, Robert, can you check whether the patch below makes both of your
systems happy again? It essentially does not allow the vector to change
for legacy IRQs, which also allows the RTE to be set correctly in the SMP
case. Based on your results and some more thinking, I will send a
detailed patch with a changelog tomorrow.
arch/x86/kernel/apic/io_apic.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index a6c64aa..4b98610 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1356,6 +1356,15 @@ static void setup_ioapic_irq(unsigned int irq, struct irq_cfg *cfg,
 	if (!IO_APIC_IRQ(irq))
 		return;
+	/*
+	 * For legacy irqs, cfg->domain starts with cpu 0. Now that IO-APIC
+	 * can handle this irq and the apic driver is finalized at this point,
+	 * update the cfg->domain.
+	 */
+	if (irq < legacy_pic->nr_legacy_irqs &&
+	    cpumask_equal(cfg->domain, cpumask_of(0)))
+		apic->vector_allocation_domain(0, cfg->domain, cpu_online_mask);
+
 	if (assign_irq_vector(irq, cfg, apic->target_cpus()))
 		return;