Message-ID: <alpine.DEB.2.20.1712131507160.1885@nanos>
Date: Wed, 13 Dec 2017 16:57:56 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Maarten Lankhorst <dev@...ankhorst.nl>
cc: Michal Hocko <mhocko@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Andy Lutomirski <luto@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>,
Daniel Vetter <daniel.vetter@...el.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Subject: Re: Linux 4.15-rc2: Regression in resume from ACPI S3
So I was finally able to figure out what the hell is going on:
Suspend:
 - The device suspend code puts the graphics card into a power
   state != PCI_D0.
 - Offline non boot CPUs
 - Break interrupt affinity. Allocate new vector on CPU 0, compose and
   write MSI message which ends up in:

    __pci_write_msi_msg(entry, msg)
    {
        if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev)) {
            /* Don't touch the hardware now */
        } else {
            ....
        }
        entry->msg = *msg;
    }

So because the device is not in PCI_D0 the message is not written to the
hardware, it's only cached in entry->msg. The actual write happens later,
in the device resume path.
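To make the deferral concrete, here is a minimal userspace sketch of that behaviour. It is not kernel code: every sketch_* name, the two-field message and the example values are made up for illustration, and the resume helper only models "go back to D0 and replay the cached message", not what the PCI core really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum sketch_power_state { SKETCH_D0, SKETCH_D3 };

struct sketch_msi_msg {
	uint32_t address_lo;
	uint32_t data;		/* carries the vector number in this sketch */
};

struct sketch_dev {
	enum sketch_power_state current_state;
	struct sketch_msi_msg hw_msg;		/* what the device would send */
	struct sketch_msi_msg cached_msg;	/* models entry->msg */
};

/* Models __pci_write_msi_msg(): touch hardware only in D0, always cache. */
static void sketch_write_msi_msg(struct sketch_dev *dev,
				 const struct sketch_msi_msg *msg)
{
	assert(dev && msg);

	if (dev->current_state == SKETCH_D0)
		dev->hw_msg = *msg;	/* hardware write */

	dev->cached_msg = *msg;		/* software copy is always updated */
}

/* Models the device resume path: back to D0, then replay the cached message. */
static void sketch_resume(struct sketch_dev *dev)
{
	dev->current_state = SKETCH_D0;
	sketch_write_msi_msg(dev, &dev->cached_msg);
}

/* True while the hardware still holds a message that differs from the cache. */
static bool sketch_hw_is_stale(const struct sketch_dev *dev)
{
	return dev->hw_msg.data != dev->cached_msg.data ||
	       dev->hw_msg.address_lo != dev->cached_msg.address_lo;
}
```

With the device in SKETCH_D3, a call to sketch_write_msi_msg() leaves hw_msg untouched, so sketch_hw_is_stale() stays true until sketch_resume() runs. That window is where any interrupt raised by the device would still use the old message.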
Resume:
[ 139.670446] ACPI: Low-level resume complete
[ 139.670541] PM: Restoring platform NVS memory
[ 139.672462] do_IRQ: 0.55 No irq handler for vector
[ 139.672475] Enabling non-boot CPUs ...
So the spurious interrupt happens early and way before the device resume
code writes the new MSI message.
I checked the behaviour on 4.14. The MSI write is delayed there in the same
way, but there is no spurious interrupt. There is no interrupt coming in at
all _BEFORE_ the device is put out of PCI_D0.
And this certainly has nothing to do with the vector management changes,
but I can't figure out yet what causes that spurious interrupt to be sent.
Any ideas welcome.
Thanks,
tglx