Message-ID: <20200511190341.GA95413@otc-nc-03>
Date: Mon, 11 May 2020 12:03:41 -0700
From: "Raj, Ashok" <ashok.raj@...el.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: "Raj, Ashok" <ashok.raj@...ux.intel.com>,
Evan Green <evgreen@...omium.org>,
Mathias Nyman <mathias.nyman@...ux.intel.com>, x86@...nel.org,
linux-pci <linux-pci@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"Ghorai, Sukumar" <sukumar.ghorai@...el.com>,
"Amara, Madhusudanarao" <madhusudanarao.amara@...el.com>,
"Nandamuri, Srikanth" <srikanth.nandamuri@...el.com>,
Ashok Raj <ashok.raj@...el.com>
Subject: Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug
Hi Thomas,
On Fri, May 08, 2020 at 06:49:15PM +0200, Thomas Gleixner wrote:
> Ashok,
>
> "Raj, Ashok" <ashok.raj@...el.com> writes:
> > With legacy MSI we can have these races and kernel is trying to do the
> > song and dance, but we see this happening even when IR is turned on.
> > Which is perplexing. I think when we have IR, once we do the change vector
> > and flush the interrupt entry cache, if there was an outstanding one in
> > flight it should be in IRR. Possibly should be cleaned up by the
> > send_cleanup_vector() I suppose.
>
> Ouch. With IR this really should never happen and yes the old vector
> will catch one which was raised just before the migration disabled the
> IR entry. During the change nothing can go wrong because the entry is
> disabled and only reenabled after it's flushed which will send a pending
> one to the new vector.
With IR, I'm not sure we actually mask the interrupt, except when
it's a Posted Interrupt.

We do an atomic update to the IRTE with cmpxchg_double:

    ret = cmpxchg_double(&irte->low, &irte->high,
                         irte->low, irte->high,
                         irte_modified->low, irte_modified->high);

followed by flushing the interrupt entry cache. After that, any
old ones in flight before the flush should be sitting in IRR
on the outgoing CPU.
send_cleanup_vector() sends an IPI to apic_id->old_cpu, which
would be the CPU we are running on, correct? And this is a self-IPI
to IRQ_MOVE_CLEANUP_VECTOR.

smp_irq_move_cleanup_interrupt() seems to check IRR with
apicid_prev_vector():

    irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
    if (irr & (1U << (vector % 32))) {
        apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR);
        continue;
    }

And this would allow any pending IRR bits on the outgoing CPU to
call the relevant ISRs before all vectors on the outgoing CPU are
drained.
Does it sound right?
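
To make the IRR indexing in the snippet above concrete, here is a rough
userspace sketch of the same arithmetic. It is only an illustration under
some assumptions: irr_reg_offset(), irr_bit_mask() and irr_vector_pending()
are made-up helper names, and the irr_regs[] array stands in for what
apic_read() would return. Only the 0x200 base for APIC_IRR, the 0x10
register stride, and the divide/modulo by 32 come from the code quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Base offset of the IRR bank in the local APIC register map. */
#define APIC_IRR 0x200u

/* Hypothetical helper: offset of the 32-bit IRR register that holds
 * the bit for 'vector'. Eight registers, 0x10 apart, cover 256 vectors. */
static inline uint32_t irr_reg_offset(unsigned int vector)
{
    assert(vector < 256);
    return APIC_IRR + (vector / 32) * 0x10;
}

/* Hypothetical helper: bit mask for 'vector' within its IRR register. */
static inline uint32_t irr_bit_mask(unsigned int vector)
{
    assert(vector < 256);
    return 1U << (vector % 32);
}

/* Hypothetical helper: irr_regs[] simulates the eight IRR registers
 * (a stand-in for apic_read()). Returns nonzero when 'vector' is
 * still pending, i.e. the case where cleanup would be retried. */
static inline int irr_vector_pending(const uint32_t irr_regs[8],
                                     unsigned int vector)
{
    uint32_t reg = irr_regs[(irr_reg_offset(vector) - APIC_IRR) / 0x10];

    return (reg & irr_bit_mask(vector)) != 0;
}
```

For example, vector 0x31 lands in the second IRR register (offset 0x210)
at bit 17, so a pending 0x31 shows up as 0x20000 in that register.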
I couldn't quite pin down how the device ISRs are hooked up through
this send_cleanup_vector() and what follows.