Date:   Thu, 6 Jun 2019 11:53:05 +0200
From:   Marc Gonzalez <marc.w.gonzalez@...e.fr>
To:     Will Deacon <will.deacon@....com>,
        Mark Rutland <mark.rutland@....com>,
        Robin Murphy <robin.murphy@....com>,
        Marc Zyngier <marc.zyngier@....com>,
        Russell King <rmk+kernel@...linux.org.uk>
Cc:     Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Race between MMIO writes and level IRQs

Hello everyone,

There's something about interrupts I have never quite understood,
which I'd like to clear up once and for all. What I'm about to write
will probably sound trivial to anyone who has already figured it out,
but I need to walk through it.

Consider a device, living on some peripheral bus, with an interrupt
line flowing from the device into some kind of interrupt controller.

I.e. there are two "communication channels":
1) the peripheral bus, and 2) the "out-of-band" interrupt line.

At some point, the device requires the CPU to do $SOMETHING. It signals
this over the interrupt line (either by sending a pulse for edge
interrupts, or by holding the line high for level interrupts). After
some time, the CPU will "take the interrupt", mask all(?) interrupts,
and jump to the proper interrupt service routine (ISR).

The CPU does whatever it's supposed to do, and then needs to inform
the device that "yes, the work is done, stop pestering me". Typically,
this is done by writing some value to one of the device's registers.

AFAICT, this is the part where things can go wrong:

The CPU issues the magic MMIO write, which will take some time to reach
the device over the peripheral bus. Meanwhile, the device maintains the
IRQ signal (assuming a level interrupt). Once the CPU leaves the ISR, the
framework will unmask IRQs. If the write has not yet reached the device,
the CPU will be needlessly interrupted again.

Basically, there's a race between the MMIO write and the IRQ unmasking.
We'd like to be able to guarantee that the MMIO write is complete before
unmasking interrupts, right?
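
To make the race concrete, here is a toy model in C. It is only a sketch
of the timing argument above, not real driver code: the function name,
the notion of "ticks", and both parameters are made up for illustration.
The level IRQ stays asserted until the write lands at the device, and the
line is sampled at the moment IRQs are unmasked.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: time advances in ticks starting when the CPU
 * issues the acknowledging MMIO write.
 *   write_latency_ticks: tick at which the write reaches the device
 *   unmask_tick:         tick at which the CPU unmasks IRQs
 * Returns true if the level IRQ is still asserted at unmask time,
 * i.e. the CPU would be interrupted again for nothing.
 */
static bool irq_asserted_at_unmask(unsigned write_latency_ticks,
                                   unsigned unmask_tick)
{
    bool irq_line = true;   /* device holds the line high until acked */

    for (unsigned t = 0; t < unmask_tick; t++) {
        if (t == write_latency_ticks)
            irq_line = false;   /* write lands, device deasserts */
    }

    /* Sanity check against the closed form of the loop above. */
    assert(irq_line == (write_latency_ticks >= unmask_tick));
    return irq_line;
}
```

With these made-up numbers, irq_asserted_at_unmask(5, 2) is true (the
write is still in flight when IRQs are unmasked), while
irq_asserted_at_unmask(1, 3) is false (the write landed in time).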

Some people use memory barriers, but my understanding is that this is
not sufficient. The memory barrier may guarantee that the MMIO write
has left the CPU "domain", but not that it has reached the device.

Am I mistaken?

So it looks like the only surefire way to guarantee that the MMIO write
has reached the device is to read the value back from the device?
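
Here is how I picture the read-back idea, again as a toy model in C
rather than real driver code. Everything in it is hypothetical: the
struct, the function names, and above all the assumption that a read
cannot complete until earlier posted writes to the same device have
arrived (PCI orders reads behind posted writes this way; whether a given
SoC interconnect does the same is part of what I'm asking). In a real
Linux driver the two accesses would presumably be writel() followed by
readl().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POSTED_MAX 8    /* arbitrary depth for the model's write buffer */

/* Hypothetical model of one device behind a bus that posts writes. */
struct model {
    bool irq_line;      /* level IRQ as driven by the device */
    size_t n_posted;    /* ack writes still in flight on the bus */
};

/* CPU side: a posted write returns as soon as it is queued. */
static void model_write_ack(struct model *m)
{
    assert(m->n_posted < POSTED_MAX);
    m->n_posted++;
}

/* Bus side: deliver every in-flight write to the device. */
static void model_bus_drain(struct model *m)
{
    if (m->n_posted > 0)
        m->irq_line = false;    /* device sees the ack, deasserts */
    m->n_posted = 0;
}

/*
 * CPU side: ASSUMPTION: the read is ordered behind earlier posted
 * writes, so it only returns after they have reached the device.
 */
static bool model_read_irq_status(struct model *m)
{
    model_bus_drain(m);
    return m->irq_line;
}

/* Returns true if the IRQ is still asserted when the ISR returns. */
static bool isr_leaves_irq_asserted(bool read_back)
{
    struct model m = { .irq_line = true, .n_posted = 0 };

    model_write_ack(&m);
    if (read_back)
        (void)model_read_irq_status(&m);   /* flushes the posted write */

    bool asserted_at_unmask = m.irq_line;  /* sampled at unmask */
    model_bus_drain(&m);                   /* write lands eventually */
    return asserted_at_unmask;
}
```

In this model, isr_leaves_irq_asserted(false) is true and
isr_leaves_irq_asserted(true) is false. A memory barrier has no
counterpart here on purpose: it would act on the CPU side only and would
not touch the model's in-flight writes.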

Tangential: is this one of the issues solved by MSI?
https://en.wikipedia.org/wiki/Message_Signaled_Interrupts#Advantages

Regards.
