Message-ID: <20200511190341.GA95413@otc-nc-03>
Date:   Mon, 11 May 2020 12:03:41 -0700
From:   "Raj, Ashok" <ashok.raj@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     "Raj, Ashok" <ashok.raj@...ux.intel.com>,
        Evan Green <evgreen@...omium.org>,
        Mathias Nyman <mathias.nyman@...ux.intel.com>, x86@...nel.org,
        linux-pci <linux-pci@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        "Ghorai, Sukumar" <sukumar.ghorai@...el.com>,
        "Amara, Madhusudanarao" <madhusudanarao.amara@...el.com>,
        "Nandamuri, Srikanth" <srikanth.nandamuri@...el.com>,
        Ashok Raj <ashok.raj@...el.com>
Subject: Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug

Hi Thomas, 

On Fri, May 08, 2020 at 06:49:15PM +0200, Thomas Gleixner wrote:
> Ashok,
> 
> "Raj, Ashok" <ashok.raj@...el.com> writes:
> > With legacy MSI we can have these races and the kernel is trying to do
> > the song and dance, but we see this happening even when IR is turned on,
> > which is perplexing. I think when we have IR, once we do the vector
> > change and flush the interrupt entry cache, if there was an outstanding
> > one in flight it should be in IRR, and presumably it should be cleaned
> > up by send_cleanup_vector().
> 
> Ouch. With IR this really should never happen and yes the old vector
> will catch one which was raised just before the migration disabled the
> IR entry. During the change nothing can go wrong because the entry is
> disabled and only reenabled after it's flushed which will send a pending
> one to the new vector.

With IR, I'm not sure we actually mask the interrupt except when
it's a Posted Interrupt.

We do an atomic update of the IRTE with cmpxchg_double():

	ret = cmpxchg_double(&irte->low, &irte->high,
			     irte->low, irte->high,
			     irte_modified->low, irte_modified->high);

followed by flushing the interrupt entry cache. After that, any old
ones that were in flight before the flush should be sitting in IRR
on the outgoing CPU.
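
For reference, this is roughly what I believe modify_irte() in
drivers/iommu/intel_irq_remapping.c does right after the cmpxchg_double()
above (quoting from memory, so the details may be off):

	/* Make the updated IRTE visible to the IOMMU hardware */
	__iommu_flush_cache(iommu, irte, sizeof(*irte));

	/*
	 * Invalidate the Interrupt Entry Cache for this index through
	 * the queued invalidation interface and wait for completion,
	 * so the hardware can no longer use the stale entry.
	 */
	qi_flush_iec(iommu, index, 0);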

send_cleanup_vector() sends an IPI to the old CPU (apicd->prev_cpu),
which would be the CPU we are currently running on, correct? And this
is a self-IPI to IRQ_MOVE_CLEANUP_VECTOR.
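
This is the piece of arch/x86/kernel/apic/vector.c I'm looking at,
quoted from memory so it may not match the tree exactly:

	static void __send_cleanup_vector(struct apic_chip_data *apicd)
	{
		unsigned int cpu;

		raw_spin_lock(&vector_lock);
		apicd->move_in_progress = 0;
		cpu = apicd->prev_cpu;
		if (cpu_online(cpu)) {
			/* queue the stale vector for cleanup on the old CPU */
			hlist_add_head(&apicd->clist,
				       per_cpu_ptr(&cleanup_list, cpu));
			apic->send_IPI(cpu, IRQ_MOVE_CLEANUP_VECTOR);
		} else {
			apicd->prev_vector = 0;
		}
		raw_spin_unlock(&vector_lock);
	}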

smp_irq_move_cleanup_interrupt() then seems to check IRR for the
previous vector (apicd->prev_vector):

	irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
	if (irr & (1U << (vector % 32))) {
		apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR);
		continue;
	}

And this would allow any vectors still pending in IRR on the outgoing
CPU to have their ISRs run before all of the old vectors on the
outgoing CPU are freed.
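
That is, the full loop in smp_irq_move_cleanup_interrupt() as I read
it (from memory, details may differ):

	hlist_for_each_entry_safe(apicd, tmp, clhead, clist) {
		unsigned int irr, vector = apicd->prev_vector;

		/*
		 * The IRR check quoted above: if the old vector is
		 * still pending, punt via a self-IPI and retry later.
		 * IRQ_MOVE_CLEANUP_VECTOR is the lowest priority
		 * external vector, so the pending device interrupt is
		 * serviced first on return from this handler.
		 */
		irr = apic_read(APIC_IRR + (vector / 32 * 0x10));
		if (irr & (1U << (vector % 32))) {
			apic->send_IPI_self(IRQ_MOVE_CLEANUP_VECTOR);
			continue;
		}

		/* Not pending any more: release the old vector */
		free_moved_vector(apicd);
	}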

Does that sound right?

I couldn't quite pin down how the device ISRs are hooked up through
this send_cleanup_vector() path and what follows.
