Date:   Mon, 6 Feb 2017 15:33:54 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Disabling msix interrupts

netdev probably isn't the right list for this, but I suspect people
reading it understand what happens.

I'm fairly sure that an MSI-X interrupt can get raised after
the kernel thinks it has masked it.

When an MSI-X interrupt is disabled, I think msi_set_mask_bit()
(in drivers/pci/msi.c) is called to write a '1' to the card's
hardware MSI-X mask register (the last 32-bit word of the table entry).
This function carefully reads back the mask register to flush
the write through the PCIe bus.
Except it doesn't; it reads the 'address_lo' register instead! [1]
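
For reference, here is a rough sketch of a mask write followed by a
readback of the mask register itself. It is not the kernel code: the
helper name msix_mask_and_flush() is made up, and an ordinary array of
four 32-bit words stands in for the memory-mapped table entry. The word
layout (address low, address high, data, vector control) and the mask
bit being bit 0 follow the MSI-X table format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word indices within one 16-byte MSI-X table entry. */
#define MSIX_WORD_LOWER_ADDR   0
#define MSIX_WORD_UPPER_ADDR   1
#define MSIX_WORD_DATA         2
#define MSIX_WORD_VECTOR_CTRL  3
#define MSIX_CTRL_MASKBIT      1u

/*
 * Hypothetical helper (assumption, not a kernel function): set the mask
 * bit of one table entry, then read the same register back. On real
 * hardware the read is what forces the posted write to reach the device;
 * here 'entry' is plain memory, so only the access pattern is shown.
 */
static uint32_t msix_mask_and_flush(volatile uint32_t *entry)
{
    assert(entry != NULL);

    entry[MSIX_WORD_VECTOR_CTRL] |= MSIX_CTRL_MASKBIT;

    /* Read back the register that was written, not address_lo. */
    return entry[MSIX_WORD_VECTOR_CTRL];
}
```

Even with the readback aimed at the right register, this only bounds
when the device has seen the mask; it says nothing about an interrupt
the device had already started to send.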

While this will stop the hardware from starting any more interrupts,
it could easily already be in the process of raising one,
i.e. have read the mask, found it zero, read the address and
data, and be in the process of issuing the PCIe write.

The PCIe write (to disable the interrupt) and the readback are seen
by the hardware as (more or less) back-to-back transfers, so both can
easily overtake the request to raise the interrupt.
The PCIe bus is also allowed to make a read completion TLP
overtake a write TLP.
Add in any host-side delays in raising the hardware interrupt
itself, and an interrupt could happen well after it was masked.
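
The ordering I have in mind can be written down as a toy model. This
is only a simulation under assumptions: the struct and function names
are invented, and it assumes the device checks the mask and latches
address/data in one step, then issues the write in a later step.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one MSI-X table entry as the device sees it. */
struct msix_entry_model {
    uint32_t addr_lo, addr_hi, data, ctrl;
};

/* An interrupt message the device has decided to send but not yet sent. */
struct pending_irq {
    bool valid;
    uint32_t addr_lo, data;
};

/* Device step 1: check the mask; if clear, latch address and data. */
static struct pending_irq device_latch(const struct msix_entry_model *e)
{
    struct pending_irq p = { false, 0u, 0u };

    if (!(e->ctrl & 1u)) {
        p.valid = true;
        p.addr_lo = e->addr_lo;
        p.data = e->data;
    }
    return p;
}

/* Host: the mask write as it lands in the device's table. */
static void host_mask(struct msix_entry_model *e)
{
    e->ctrl |= 1u;
}

/* Device step 2: issue the write; true means a message goes out. */
static bool device_issue(const struct pending_irq *p)
{
    return p->valid;
}

/* Latch, then mask, then issue: is a message sent after the mask? */
static bool interrupt_after_mask(void)
{
    struct msix_entry_model e = { 0xfee00000u, 0u, 0x4021u, 0u };
    struct pending_irq p = device_latch(&e);

    host_mask(&e);
    return device_issue(&p);
}

/* Mask first, then latch: nothing should be sent. */
static bool interrupt_when_masked_first(void)
{
    struct msix_entry_model e = { 0xfee00000u, 0u, 0x4021u, 0u };
    struct pending_irq p;

    host_mask(&e);
    p = device_latch(&e);
    return device_issue(&p);
}
```

In this model interrupt_after_mask() reports a message sent even though
the mask bit was set before the write was issued, while
interrupt_when_masked_first() does not.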

More worrying would be any code that tries to change the address
and data associated with an interrupt.
You'd need moderate guard times after the disable and before the
enable to ensure the hardware didn't raise an interrupt with
a mismatch of the old and new values.
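
A toy model of that mismatch, again with invented names and under the
assumption that the device reads the address and data words as two
separate steps with the host's update landing in between (real hardware
may well latch them together):

```c
#include <stdint.h>

/* Toy model of one MSI-X table entry. */
struct msix_entry_model {
    uint32_t addr_lo, addr_hi, data, ctrl;
};

/* The message the device ends up sending. */
struct msix_message {
    uint32_t addr_lo, data;
};

/* Host: mask, rewrite address and data, unmask, with no guard time. */
static void host_retarget(struct msix_entry_model *e,
                          uint32_t new_addr_lo, uint32_t new_data)
{
    e->ctrl |= 1u;
    e->addr_lo = new_addr_lo;
    e->data = new_data;
    e->ctrl &= ~1u;
}

/*
 * Device already past its mask check: it reads the address, the host
 * update lands, then it reads the data. Assumed interleaving only.
 */
static struct msix_message
device_send_across_retarget(struct msix_entry_model *e,
                            uint32_t new_addr_lo, uint32_t new_data)
{
    struct msix_message m;

    m.addr_lo = e->addr_lo;                      /* old address */
    host_retarget(e, new_addr_lo, new_data);
    m.data = e->data;                            /* new data */
    return m;
}
```

With the entry starting at address 0xfee00000 and data 0x4021, and the
host retargeting to 0xfee01000 and 0x4031, the model sends the old
address with the new data. A guard time after the mask would give the
in-flight message time to drain before the entry is rewritten.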

[1] Maybe I'll look at the order those cycles actually arrive in.

	David
