linux-kernel - Re: Lost MSIs during hibernate

Message-ID: <87a6cz39qd.ffs@tglx>
Date:   Tue, 05 Apr 2022 16:06:50 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Evan Green <evgreen@...gle.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Rajat Jain <rajatja@...omium.org>,
        Linux PM <linux-pm@...r.kernel.org>,
        linux-pci <linux-pci@...r.kernel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Mathias Nyman <mathias.nyman@...el.com>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Subject: Re: Lost MSIs during hibernate

Evan!

On Mon, Apr 04 2022 at 12:11, Evan Green wrote:
> To my surprise, I'm back with another MSI problem, and hoping to get
> some advice on how to approach fixing it.

Why am I not surprised?

> What worries me is those IRQ "no longer affine" messages, as well as
> my "EVAN don't touch hw" prints, indicating that requests to change
> the MSI are being dropped. These ignored requests are coming in when
> we try to migrate all IRQs off of the non-boot CPU, and they get
> ignored because all devices are "frozen" at this point, and presumably
> not in D0.

They are disabled at that point.

> To further try and prove that theory, I wrote a script to do the
> hibernate prepare image step in a loop, but messed with XHCI's IRQ
> affinity beforehand. If I move the IRQ to core 0, so far I have never
> seen a hang. But if I move it to another core, I can usually get a
> hang in the first attempt. I also very occasionally see wifi splats
> when trying this, and those "no longer affine" prints are all the wifi
> queue IRQs. So I think a wifi packet coming in at the wrong time can
> do the same thing.
>
> I wanted to see what thoughts you might have on this. Should I try to
> make a patch that moves all IRQs to CPU 0 *before* the devices all
> freeze? Sounds a little unpleasant. Or should PCI be doing something
> different to avoid this combination of "you're not allowed to modify
> my MSIs, but I might still generate interrupts that must not be lost"?

PCI cannot do much here and moving interrupts around is papering over
the underlying problem.
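
For anyone who wants to repeat the affinity half of the experiment
quoted above, here is a minimal sketch. It assumes the standard
/proc/irq/<N>/smp_affinity_list interface; the helper name and the
proc_root parameter are made up for illustration, and the IRQ number
has to be looked up in /proc/interrupts first. It does not attempt the
hibernate step, and it is only a debugging aid, not a fix.

```python
from pathlib import Path


def set_irq_affinity(irq: int, cpus: str, proc_root: str = "/proc") -> Path:
    """Write a CPU list such as "0" or "2-3" to the IRQ's smp_affinity_list.

    proc_root is a hypothetical parameter that exists only so the helper
    can be exercised against a scratch directory instead of the real /proc.
    """
    target = Path(proc_root) / "irq" / str(irq) / "smp_affinity_list"
    # The kernel parses a comma/range separated CPU list, e.g. "0" or "1,3-4".
    target.write_text(cpus + "\n")
    return target


# Example (needs root, IRQ number is a placeholder):
#   set_irq_affinity(125, "0")
```

Writing to the real file requires root, and the kernel may reject the
write for interrupts whose affinity cannot be changed.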

xhci_hcd 0000:00:0d.0: EVAN Write MSI 0 fee1e000 4023

  This sets up the interrupt when the driver is loaded

xhci_hcd 0000:00:14.0: EVAN Write MSI 0 fee01000 4024

  Ditto

xhci_hcd 0000:00:0d.0: calling pci_pm_freeze+0x0/0xad @ 423, parent: pci0000:00
xhci_hcd 0000:00:14.0: calling pci_pm_freeze+0x0/0xad @ 4644, parent: pci0000:00
xhci_hcd 0000:00:14.0: pci_pm_freeze+0x0/0xad returned 0 after 0 usecs
xhci_hcd 0000:00:0d.0: EVAN Write MSI 0 fee1e000 4023
xhci_hcd 0000:00:0d.0: pci_pm_freeze+0x0/0xad returned 0 after 196000 usecs

Those freeze() calls end up in xhci_suspend(), which tears down the XHCI
and ensures that no interrupts are in flight.

xhci_hcd 0000:00:0d.0: calling pci_pm_freeze_noirq+0x0/0xb2 @ 4645, parent: pci0000:00
xhci_hcd 0000:00:0d.0: pci_pm_freeze_noirq+0x0/0xb2 returned 0 after 30 usecs
xhci_hcd 0000:00:14.0: calling pci_pm_freeze_noirq+0x0/0xb2 @ 4644, parent: pci0000:00
xhci_hcd 0000:00:14.0: pci_pm_freeze_noirq+0x0/0xb2 returned 0 after 3118 usecs

   Now the devices are disabled and not accessible

xhci_hcd 0000:00:14.0: EVAN Don't touch hw 0 fee00000 4024
xhci_hcd 0000:00:0d.0: EVAN Don't touch hw 0 fee1e000 4045
xhci_hcd 0000:00:0d.0: EVAN Don't touch hw 0 fee00000 4045
xhci_hcd 0000:00:14.0: calling pci_pm_thaw_noirq+0x0/0x70 @ 9, parent: pci0000:00
xhci_hcd 0000:00:14.0: EVAN Write MSI 0 fee00000 4024

   This is the early restore _before_ the XHCI resume code is called.
   This interrupt is targeted at CPU0 (it's the one which could not be
   written above).

xhci_hcd 0000:00:14.0: pci_pm_thaw_noirq+0x0/0x70 returned 0 after 5272 usecs
xhci_hcd 0000:00:0d.0: calling pci_pm_thaw_noirq+0x0/0x70 @ 1123, parent: pci0000:00
xhci_hcd 0000:00:0d.0: EVAN Write MSI 0 fee00000 4045

   Ditto

xhci_hcd 0000:00:0d.0: pci_pm_thaw_noirq+0x0/0x70 returned 0 after 623 usecs
xhci_hcd 0000:00:14.0: calling pci_pm_thaw+0x0/0x7c @ 3856, parent: pci0000:00
xhci_hcd 0000:00:14.0: pci_pm_thaw+0x0/0x7c returned 0 after 0 usecs
xhci_hcd 0000:00:0d.0: calling pci_pm_thaw+0x0/0x7c @ 4664, parent: pci0000:00
xhci_hcd 0000:00:0d.0: pci_pm_thaw+0x0/0x7c returned 0 after 0 usecs

That means the suspend/resume logic is doing the right thing.
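
To make the address values in the log above easier to read, here is a
small sketch that splits an MSI address/data pair into its fields. It
assumes the plain x86 MSI format (no interrupt remapping) and that the
debug prints show the low address word and the data word in hex;
neither is stated in the log itself. The function name is made up for
illustration. Note that the destination field is an APIC ID, which
does not have to match the Linux CPU number.

```python
def decode_msi(address: int, data: int) -> dict:
    """Split an x86 MSI address/data pair into its architectural fields."""
    if (address >> 20) != 0xFEE:
        raise ValueError("not an x86 MSI address (expected 0xFEExxxxx)")
    return {
        "dest_id": (address >> 12) & 0xFF,         # destination APIC ID, bits 19:12
        "redirection_hint": (address >> 3) & 1,    # bit 3
        "dest_mode_logical": (address >> 2) & 1,   # bit 2
        "vector": data & 0xFF,                     # bits 7:0
        "delivery_mode": (data >> 8) & 0x7,        # bits 10:8, 0 means fixed
    }


# Address/data pairs copied from the log lines above, read as hex.
for addr, data in [(0xFEE1E000, 0x4023), (0xFEE01000, 0x4024),
                   (0xFEE00000, 0x4024), (0xFEE00000, 0x4045)]:
    f = decode_msi(addr, data)
    print(f"{addr:08x} {data:04x}: dest_id={f['dest_id']:#x} "
          f"vector={f['vector']:#x}")
```

Under those assumptions fee1e000 and fee01000 carry destination IDs
0x1e and 0x01, while the fee00000 values restored in thaw_noirq carry
destination ID 0, which matches the interrupts being targeted at CPU0.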

How the XHCI ends up being confused here is a mystery. Cc'ed a few more folks.

Thanks,

        tglx