[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 05 May 2020 23:47:26 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: "Raj\, Ashok" <ashok.raj@...el.com>
Cc: "Raj\, Ashok" <ashok.raj@...ux.intel.com>,
Evan Green <evgreen@...omium.org>,
Mathias Nyman <mathias.nyman@...ux.intel.com>, x86@...nel.org,
linux-pci <linux-pci@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"Ghorai\, Sukumar" <sukumar.ghorai@...el.com>,
"Amara\, Madhusudanarao" <madhusudanarao.amara@...el.com>,
"Nandamuri\, Srikanth" <srikanth.nandamuri@...el.com>,
Ashok Raj <ashok.raj@...el.com>
Subject: Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug
Ashok,
"Raj, Ashok" <ashok.raj@...el.com> writes:
> On Tue, May 05, 2020 at 09:36:04PM +0200, Thomas Gleixner wrote:
>> The way it works is:
>>
>> 1) New vector on different CPU is allocated
>>
>> 2) New vector is installed
>>
>> 3) When the first interrupt on the new CPU arrives then the cleanup
>> IPI is sent to the previous CPU which tears down the old vector
>>
>> So if the interrupt is sent to the original CPU just before #2 then this
>> will be correctly handled on that CPU after the set affinity code
>> reenables interrupts.
>
> I'll have this test tried out.. but in msi_set_affinity() the check below
> for lapic_vector_set_in_irr(cfg->vector), this is the new vector correct?
> don't we want to check for old_cfg.vector instead?
>
> msi_set_affinit ()
> {
> ....
> unlock_vector_lock();
>
> /*
> * Check whether the transition raced with a device interrupt and
> * is pending in the local APICs IRR. It is safe to do this outside
> * of vector lock as the irq_desc::lock of this interrupt is still
> * held and interrupts are disabled: The check is not accessing the
> * underlying vector store. It's just checking the local APIC's
> * IRR.
> */
> if (lapic_vector_set_in_irr(cfg->vector))
> irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);
No. This catches the transitional interrupt to the new vector on the
original CPU, i.e. the one which is running that code.
Again the steps are:
1) Allocate new vector on new CPU
2) Set new vector on original CPU
3) Set new vector on new CPU
So we have 3 points where an interrupt can fire:
A) Before #2
B) After #2 and before #3
C) After #3
#A is hitting the old vector which is still valid on the old CPU and
will be handled once interrupts are enabled with the correct irq
descriptor - Normal operation (same as with maskable MSI)
#B This must be checked in the IRR because the there is no valid vector
on the old CPU.
#C is handled on the new vector on the new CPU
Thanks,
tglx
Powered by blists - more mailing lists