linux-kernel - Re: Linux 4.15-rc2: Regression in resume from ACPI S3

Message-ID: <alpine.DEB.2.20.1712061320090.1724@nanos>
Date:   Wed, 6 Dec 2017 13:23:34 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Michal Hocko <mhocko@...nel.org>
cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Andy Lutomirski <luto@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>
Subject: Re: Linux 4.15-rc2: Regression in resume from ACPI S3

On Wed, 6 Dec 2017, Michal Hocko wrote:
> merging tip/x86/urgent on top of your tree fixed this problem for me,
> but I am seeing something else
> [  131.711412] ACPI: Preparing to enter system sleep state S3
> [  131.755328] ACPI: EC: event blocked
> [  131.755328] ACPI: EC: EC stopped
> [  131.755328] PM: Saving platform NVS memory
> [  131.755344] Disabling non-boot CPUs ...
> [  131.779330] IRQ 124: no longer affine to CPU1
> [  131.780334] smpboot: CPU 1 is now offline
> [  131.804465] smpboot: CPU 2 is now offline
> [  131.827291] IRQ 122: no longer affine to CPU3
> [  131.827292] IRQ 123: no longer affine to CPU3
> [  131.828293] smpboot: CPU 3 is now offline
> [  131.830991] ACPI: Low-level resume complete
> [  131.831092] ACPI: EC: EC started
> [  131.831093] PM: Restoring platform NVS memory
> [  131.831864] do_IRQ: 0.55 No irq handler for vector

Hmm, that's really odd.

> [  131.831884] Enabling non-boot CPUs ...
> [  131.831909] x86: Booting SMP configuration:
> [  131.831910] smpboot: Booting Node 0 Processor 1 APIC 0x2
> [  131.832913]  cache: parent cpu1 should not be sleeping

This is an old one. 

> [  131.833058] CPU1 is up
> [  131.833067] smpboot: Booting Node 0 Processor 2 APIC 0x1
> [  131.833864]  cache: parent cpu2 should not be sleeping
> [  131.833983] CPU2 is up
> [  131.833995] smpboot: Booting Node 0 Processor 3 APIC 0x3
> [  131.834776]  cache: parent cpu3 should not be sleeping
> [  131.834923] CPU3 is up
> 
> "No irq handler" part looks a bit scary (maybe related to lost affinity
> messages?) but the following messages look quite as well. Is this
> something known? The system seems to be up and running without any
> visible issues.

I assume it's due to the affinity break, just that we don't know right now
on which CPU that do_IRQ() message triggered. I assume it's CPU0 because
the others are offline already, but ....

I'll think about how we can figure out what's going on.
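
As a rough aid for lining up the affinity-break messages with the do_IRQ()
one, something like the sketch below could be run over a saved dmesg. The
regular expressions only mirror the log lines quoted above, and the names
(summarize, AFFINITY_RE, NO_HANDLER_RE, SAMPLE) are made up for illustration;
none of this is existing kernel tooling. The two numbers printed by do_IRQ()
are extracted as-is, without assuming what they mean.

```python
import re

# Patterns mirror the quoted log lines; they are illustrative assumptions,
# not a stable kernel log format guarantee.
AFFINITY_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] IRQ (\d+): no longer affine to CPU(\d+)")
NO_HANDLER_RE = re.compile(
    r"\[\s*(\d+\.\d+)\] do_IRQ: (\d+)\.(\d+) No irq handler for vector")


def summarize(log_text):
    """Return (broken, unhandled) lists of (timestamp, first, second) tuples.

    broken:    (timestamp, irq, cpu) for each affinity-break message
    unhandled: (timestamp, a, b) for each do_IRQ message, where a.b are
               the two raw numbers printed before "No irq handler"
    """
    broken, unhandled = [], []
    for line in log_text.splitlines():
        m = AFFINITY_RE.search(line)
        if m:
            broken.append((float(m.group(1)), int(m.group(2)), int(m.group(3))))
            continue
        m = NO_HANDLER_RE.search(line)
        if m:
            unhandled.append((float(m.group(1)), int(m.group(2)), int(m.group(3))))
    return broken, unhandled


# Lines copied from the report above.
SAMPLE = """\
[  131.779330] IRQ 124: no longer affine to CPU1
[  131.780334] smpboot: CPU 1 is now offline
[  131.827291] IRQ 122: no longer affine to CPU3
[  131.827292] IRQ 123: no longer affine to CPU3
[  131.828293] smpboot: CPU 3 is now offline
[  131.831864] do_IRQ: 0.55 No irq handler for vector
"""

if __name__ == "__main__":
    broken, unhandled = summarize(SAMPLE)
    for ts, irq, cpu in broken:
        print(f"{ts:.6f}: IRQ {irq} lost affinity to CPU{cpu}")
    for ts, a, b in unhandled:
        print(f"{ts:.6f}: do_IRQ reported {a}.{b} with no handler")
```

That only orders the events by timestamp; it does not tell us which of the
moved interrupts, if any, is behind the stray vector.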

Thanks,

	tglx