Message-ID: <871rrqva0t.fsf@nanos.tec.linutronix.de>
Date:   Thu, 23 Jan 2020 09:42:42 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Evan Green <evgreen@...omium.org>
Cc:     Bjorn Helgaas <helgaas@...nel.org>, linux-pci@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>,
        Marc Zyngier <maz@...nel.org>, Christoph Hellwig <hch@....de>,
        Rajat Jain <rajatxjain@...il.com>
Subject: Re: [PATCH] PCI/MSI: Avoid torn updates to MSI pairs

Evan Green <evgreen@...omium.org> writes:
> On Wed, Jan 22, 2020 at 3:37 PM Thomas Gleixner <tglx@...utronix.de> wrote:
>> > One other way you could avoid torn MSI writes would be to ensure that
>> > if you migrate IRQs across cores, you keep the same x86 vector number.
>> > That way the address portion would be updated, and data doesn't
>> > change, so there's no window. But that may not actually be feasible.
>>
>> That's not possible simply because the x86 vector space is limited. If
>> we had to guarantee that, we'd end up with a max of ~220 interrupts
>> per system. Sufficient for your notebook, but the big iron people
>> would not be amused.
>
> Right, that occurred to me as well. The actual requirement isn't quite
> as restrictive. What you really need is the old vector to be
> registered on both the old CPU and the new CPU. Then once the
> interrupt is confirmed to have moved we could release the old
> vector on both CPUs, leaving only the new vector on the new CPU.

Sure, and how can you guarantee that without reserving the vector on all
CPUs in the first place? If you don't do that, affinity setting would
fail every so often, namely whenever the vector is not available on the
target. It would also pretty much prevent hotplug if a to-be-migrated
vector is not available on at least one online CPU.
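
To make the failure mode concrete, here is a toy Python model (not kernel
code; the Cpu class and the helpers allocate() and migrate_same_vector()
are made up for illustration). It assumes each CPU simply tracks a set of
vector numbers in use, and that a same-vector migration needs that number
to be free on the target:

```python
class Cpu:
    """Hypothetical per-CPU vector bookkeeping: just a set of used numbers."""

    def __init__(self, name):
        self.name = name
        self.used = set()


def allocate(cpu, vector):
    """Claim `vector` on `cpu`; return False if it is already taken there."""
    if vector in cpu.used:
        return False
    cpu.used.add(vector)
    return True


def migrate_same_vector(vector, src, dst):
    """Move an interrupt from src to dst while keeping its vector number.

    This can only work if the same number is still free on dst.
    """
    if not allocate(dst, vector):
        return False  # vector busy on the target: the affinity change fails
    src.used.discard(vector)
    return True


cpu0, cpu1 = Cpu("cpu0"), Cpu("cpu1")
allocate(cpu0, 0x40)  # device A on cpu0
allocate(cpu1, 0x40)  # unrelated device B happens to use 0x40 on cpu1

blocked = migrate_same_vector(0x40, cpu0, cpu1)  # collides with device B
allocate(cpu0, 0x41)
moved = migrate_same_vector(0x41, cpu0, cpu1)    # 0x41 is free on cpu1
print(blocked, moved)
```

In this model the first migration is refused unless 0x40 had been reserved
on cpu1 up front, and if cpu0 were going offline there would be nowhere to
put that interrupt.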

> In that world some SMP affinity transitions might fail, which is a
> bummer. To avoid that, you could first migrate to a vector that's
> available on both the source and destination CPUs, keeping affinity
> the same. Then change affinity in a separate step.

Good luck with doing that at the end of the hotplug routine where the
CPU is about to vanish.

> Or alternatively, you could permanently designate a "transit" vector.
> If an interrupt fires on this vector, then we call all ISRs currently
> in transit between CPUs. You might end up calling ISRs that didn't
> actually need service, but at least that's better than missing edges.

I don't think we need that. While walking the dogs I thought about
invoking a force migrated interrupt on the target CPU, but haven't
thought it through yet.

>> 'lspci -vvv' and 'cat /proc/interrupts'
>
> Here it is:
> https://pastebin.com/YyxBUvQ2

Hrm:

        Capabilities: [80] MSI-X: Enable+ Count=16 Masked-

So this is weird. We mask it before moving it, so the tear issue should
not happen on MSI-X. So the tearing might be just a red herring.
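
For reference, the tearing scenario and the effect of masking can be
sketched with a toy Python model (purely illustrative, not how the kernel
or the hardware is implemented; MsiEntry, retarget() and the address/data
values are invented). It assumes a masked entry latches a pending bit and
delivers the message once it is unmasked:

```python
class MsiEntry:
    """Toy message entry: an (address, data) pair, a mask bit, a pending bit."""

    def __init__(self, address, data):
        self.address = address
        self.data = data
        self.masked = False
        self.pending = False
        self.delivered = []  # (address, data) pairs that were actually sent

    def fire(self):
        """The device raises the interrupt."""
        if self.masked:
            self.pending = True  # held back until unmask
            return
        self.delivered.append((self.address, self.data))

    def unmask(self):
        self.masked = False
        if self.pending:
            self.pending = False
            self.fire()


def retarget(entry, new_address, new_data, mask_first):
    """Rewrite address and data as two separate writes.

    The device is made to fire between the two writes, which is the
    window in which a torn message can be observed.
    """
    if mask_first:
        entry.masked = True
    entry.address = new_address  # first write
    entry.fire()                 # interrupt arrives inside the window
    entry.data = new_data        # second write
    if mask_first:
        entry.unmask()


OLD = (0xFEE00000, 0x21)  # made-up old address/vector
NEW = (0xFEE01000, 0x35)  # made-up new address/vector

torn = MsiEntry(*OLD)
retarget(torn, *NEW, mask_first=False)

safe = MsiEntry(*OLD)
retarget(safe, *NEW, mask_first=True)

print(torn.delivered)  # new address combined with the old data
print(safe.delivered)  # consistent new pair, delivered after unmask
```

Without masking, the model sends a message to the new address carrying the
old data. With masking, the only message sent is the consistent new pair.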

Let me stare into the code a bit.

Thanks,

        tglx
