linux-kernel - Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Tue, 05 May 2020 23:47:26 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     "Raj\, Ashok" <ashok.raj@...el.com>
Cc:     "Raj\, Ashok" <ashok.raj@...ux.intel.com>,
        Evan Green <evgreen@...omium.org>,
        Mathias Nyman <mathias.nyman@...ux.intel.com>, x86@...nel.org,
        linux-pci <linux-pci@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        "Ghorai\, Sukumar" <sukumar.ghorai@...el.com>,
        "Amara\, Madhusudanarao" <madhusudanarao.amara@...el.com>,
        "Nandamuri\, Srikanth" <srikanth.nandamuri@...el.com>,
        Ashok Raj <ashok.raj@...el.com>
Subject: Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug

Ashok,

"Raj, Ashok" <ashok.raj@...el.com> writes:
> On Tue, May 05, 2020 at 09:36:04PM +0200, Thomas Gleixner wrote:
>> The way it works is:
>> 
>>     1) New vector on different CPU is allocated
>> 
>>     2) New vector is installed
>> 
>>     3) When the first interrupt on the new CPU arrives then the cleanup
>>        IPI is sent to the previous CPU which tears down the old vector
>> 
>> So if the interrupt is sent to the original CPU just before #2 then this
>> will be correctly handled on that CPU after the set affinity code
>> reenables interrupts.
>
> I'll have this test tried out.. but in msi_set_affinity() the check below
> for lapic_vector_set_in_irr(cfg->vector), this is the new vector correct? 
> don't we want to check for old_cfg.vector instead?
>
> msi_set_affinit ()
> {
> ....
>         unlock_vector_lock();
>
>         /*
>          * Check whether the transition raced with a device interrupt and
>          * is pending in the local APICs IRR. It is safe to do this outside
>          * of vector lock as the irq_desc::lock of this interrupt is still
>          * held and interrupts are disabled: The check is not accessing the
>          * underlying vector store. It's just checking the local APIC's
>          * IRR.
>          */
>         if (lapic_vector_set_in_irr(cfg->vector))
>                 irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);

No. This catches the transitional interrupt to the new vector on the
original CPU, i.e. the one which is running that code.

Again the steps are:

 1) Allocate new vector on new CPU

 2) Set new vector on original CPU

 3) Set new vector on new CPU

So we have 3 points where an interrupt can fire:

 A) Before #2

 B) After #2 and before #3

 C) After #3

#A is hitting the old vector which is still valid on the old CPU and
   will be handled once interrupts are enabled with the correct irq
   descriptor - Normal operation (same as with maskable MSI)

#B This must be checked in the IRR because the there is no valid vector
   on the old CPU.

#C is handled on the new vector on the new CPU

Thanks,

        tglx