Date:   Wed, 13 Dec 2017 10:23:37 -0600
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Maarten Lankhorst <dev@...ankhorst.nl>,
        Michal Hocko <mhocko@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Andy Lutomirski <luto@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Daniel Vetter <daniel.vetter@...el.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        linux-pci@...r.kernel.org, linux-pm@...r.kernel.org
Subject: Re: Linux 4.15-rc2: Regression in resume from ACPI S3

[+cc linux-pci, linux-pm]

On Wed, Dec 13, 2017 at 04:57:56PM +0100, Thomas Gleixner wrote:
> So I was finally able to figure out what the hell is going on:
> 
> Suspend:
> 
>  - The device suspend code puts the graphics card into a power
>    state != PCI_D0.
> 
>  - Offline non-boot CPUs
> 
>  - Break interrupt affinity. Allocate new vector on CPU 0, compose and
>    write MSI message which ends up in:
> 
>    __pci_write_msi_msg(entry, msg)
>    {
> 	if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev)) {
> 	   /* Don't touch the hardware now */
> 	} else {
> 	   ....
> 	}
> 	entry->msg = *msg;
>    }
>  
>   So because the device is not in PCI_D0 the message is not written. It's
>   written in the device resume path.

I'm not a PM guru, but this ordering seems fragile.  If we offline
CPUs before re-targeting interrupts directed at those CPUs, aren't we
always going to be at risk of sending interrupts to an offline CPU?

Even if the device is now asleep and therefore should not generate an
interrupt, it seems like there's a window when the device returns to
PCI_D0 where it could generate an interrupt before we have a chance to
update the MSI message.
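
To make that window concrete, here is a toy user-space model of the
deferred write described above.  It is only a sketch: every type and
function name in it (model_dev, model_write_msi_msg, model_power_up,
model_restore_msi, model_hw_is_stale) is hypothetical and stands in for
the real kernel structures, and splitting resume into a "power up" step
and a "restore MSI" step is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real types. */
enum model_power_state { MODEL_D0, MODEL_D3 };

struct model_msi_msg {
	unsigned int dest_cpu;
	unsigned int vector;
};

struct model_dev {
	enum model_power_state current_state;
	struct model_msi_msg cached;	/* plays the role of entry->msg */
	struct model_msi_msg hw;	/* what the device would actually signal */
};

/* Mirrors the quoted logic: skip the hardware unless the device is in D0. */
static void model_write_msi_msg(struct model_dev *dev,
				const struct model_msi_msg *msg)
{
	assert(dev != NULL && msg != NULL);
	if (dev->current_state == MODEL_D0)
		dev->hw = *msg;
	dev->cached = *msg;	/* always remember the latest target */
}

/* Resume step 1 in this model: power returns, message not yet restored. */
static void model_power_up(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->current_state = MODEL_D0;
}

/* Resume step 2 in this model: replay the cached message to the hardware. */
static void model_restore_msi(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->hw = dev->cached;
}

/* True while the device could still signal the old CPU/vector. */
static bool model_hw_is_stale(const struct model_dev *dev)
{
	assert(dev != NULL);
	return dev->hw.dest_cpu != dev->cached.dest_cpu ||
	       dev->hw.vector != dev->cached.vector;
}
```

In this model, a retarget issued while the device is in MODEL_D3 leaves
model_hw_is_stale() true, and it stays true after model_power_up() until
model_restore_msi() runs.  That interval is the window I'm worried
about.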

> Resume:
> [  139.670446] ACPI: Low-level resume complete
> [  139.670541] PM: Restoring platform NVS memory
> [  139.672462] do_IRQ: 0.55 No irq handler for vector
> [  139.672475] Enabling non-boot CPUs ...
> 
> So the spurious interrupt happens early and way before the device resume
> code writes the new MSI message.
> 
> I checked the behaviour on 4.14. The MSI write is delayed there in the same
> way, but there is no spurious interrupt. There is no interrupt coming in at
> all _BEFORE_ the device is put out of PCI_D0.
> 
> And this has certainly nothing to do with the vector management changes,
> but I can't figure out yet what causes that spurious interrupt to be sent.
> 
> Any ideas welcome.
> 
> Thanks,
> 
> 	tglx
> 
