Message-ID: <CAPDyKFq55Vqfd7cMdmQZBzvS1Xr-Z4QaTzEeuWWn3EX4HBbP3A@mail.gmail.com>
Date: Tue, 9 Dec 2025 16:08:59 +0100
From: Ulf Hansson <ulf.hansson@...aro.org>
To: Tabby Kitten <nyanpasu256@...il.com>
Cc: linux-mmc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled

Hi,

On Wed, 26 Nov 2025 at 10:08, Tabby Kitten <nyanpasu256@...il.com> wrote:
>
> On a PC with a Realtek PCI Express SD reader, when you sleep with
> `wakeup_count` active (e.g. sleeping from KDE's lock screen), the MMC
> driver wakes up the system and aborts suspend.

Okay, that's clearly a problem that needs to be fixed!

>
> I've found a sleep failure bug in the rtsx_pci and mmc_core drivers.
> After userspace writes a number to `/sys/power/wakeup_count` (e.g. KDE
> Plasma does it to distinguish user wakes from timers and Wake-on-LAN),
> if it attempts a mem suspend it will be aborted when
> rtsx_pci_runtime_resume() -> mmc_detect_change() emits a
> pm_wakeup_ws_event(). This breaks sleep on some hardware and desktop
> environments.
>
> The detailed description:
> The recently released Plasma 6.5.0 writes to `/sys/power/wakeup_count`
> before sleeping. On my computer this caused the sleep attempt to fail
> with dmesg error "PM: Some devices failed to suspend, or early wake
> event detected". I got this error on both Arch Linux and Fedora, and
> replicated it on Fedora with the mainline kernel COPR. KDE is tracking
> this error at https://bugs.kde.org/show_bug.cgi?id=510992, and has
> disabled writing to wakeup_count on Plasma 6.5.3 to work around this
> issue.
>
> I've written a standalone shell script to reproduce this sleep failure
> (save as badsleep.sh):
>
> #!/bin/bash
> read wakeup_count < /sys/power/wakeup_count
> e=$?  # save the status now; the [[ ]] test below would overwrite $?
> if [[ $e -ne 0 ]]; then
>     echo "Failed to open wakeup_count, suspend maybe already in progress"
>     exit $e
> fi
> echo "$wakeup_count" > /sys/power/wakeup_count
> e=$?
> if [[ $e -ne 0 ]]; then
>     echo "Failed to write wakeup_count, wakeup_count may have changed in between"
>     exit $e
> fi
> echo mem > /sys/power/state
>
> Running `sudo ./badsleep.sh` reproduces failed sleeps on my computer.
> (sudo is needed to write to `/sys/power/wakeup_count` on Fedora.)
>
> * If I run the script unaltered, the screen turns off and on, and the
>   terminal outputs
>   `./badsleep.sh: line 14: echo: write error: Device or resource busy`
>   indicating the mem sleep failed.
>
> * If I edit the script and comment out `echo $wakeup_count >
>   /sys/power/wakeup_count`, the sleep succeeds, and waking the computer
>   skips the lock screen and resumes where I left off.
>
> * If I run `sudo rmmod rtsx_pci_sdmmc` to disable the faulty module, the
>   sleep succeeds, and waking the computer skips the lock screen and
>   resumes where I left off.
>
> I think this problem happens in general when a driver spawns a wakeup
> event from its suspend callback. On my system, the driver in question
> lies in the MMC subsystem.
>
> ## Code debugging
>
> If I run `echo 1 > /sys/power/pm_debug_messages` to enable verbose
> logging, then attempt a failed sleep, I see output:
>
>     PM: Wakeup pending, aborting suspend
>     PM: active wakeup source: mmc0
>     PM: suspend of devices aborted after 151.615 msecs
>     PM: start suspend of devices aborted after 169.797 msecs
>     PM: Some devices failed to suspend, or early wake event detected
>
> The "Wakeup pending, aborting suspend" message comes from function
> `pm_wakeup_pending()`. This function checks whether wakeup event
> checking is enabled and, if the counters have changed, aborts suspend
> and calls `pm_print_active_wakeup_sources()`, which prints
> `wakeup_sources`.
> Tracing the code that modifies `wakeup_sources`, I found that
> `pm_wakeup_ws_event()` would activate an event and
> `wakeup_source_register() → wakeup_source_add()` would add a new one.

Thanks for all the details!
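
For anyone else trying to identify which driver aborts their suspend,
the relevant lines can be pulled out of the kernel log with a small
helper. This is only a sketch: the function name is made up for
illustration, and it assumes the "PM: active wakeup source: <name>"
line format shown in the log excerpt above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tool): read a kernel log
# on stdin and print the name of each wakeup source reported as active.
# Assumes the "PM: active wakeup source: <name>" format quoted above.
active_wakeup_sources_from_log() {
    # Strip everything up to and including the prefix, keep only matching
    # lines, and collapse repeated reports of the same source.
    sed -n 's/^.*PM: active wakeup source: //p' | sort -u
}
```

After enabling /sys/power/pm_debug_messages and attempting a suspend,
something like `sudo dmesg | active_wakeup_sources_from_log` should
list the sources; fed the log excerpt above, it prints `mmc0`.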

>
> To find who changed wakeup events, I used my stacksnoop fork at
> https://github.com/nyanpasu64/bcc/blob/local/examples/tracing/stacksnoop.py
> to trace a failed suspend:
>
> nyanpasu64@...en ~/code/bcc (local)> sudo ./examples/tracing/stacksnoop.py pm_wakeup_ws_event wakeup_source_register
> TIME(s)            FUNCTION
> 7.254676819:
> 0: ret_from_fork_asm [kernel]
> 1: ret_from_fork [kernel]
> 2: kthread [kernel]
> 3: worker_thread [kernel]
> 4: process_one_work [kernel]
> 5: async_run_entry_fn [kernel]
> 6: async_suspend [kernel]
> 7: device_suspend [kernel]
> 8: dpm_run_callback [kernel]
> 9: mmc_bus_suspend [mmc_core]
> 10: mmc_blk_suspend [mmc_block]
> 11: mmc_queue_suspend [mmc_block]
> 12: __mmc_claim_host [mmc_core]
> 13: __pm_runtime_resume [kernel]
> 14: rpm_resume [kernel]
> 15: rpm_resume [kernel]
> 16: rpm_callback [kernel]
> 17: __rpm_callback [kernel]
> 18: rtsx_pci_runtime_resume [rtsx_pci]
> 19: mmc_detect_change [mmc_core]
> 20: pm_wakeup_ws_event [kernel]
>
> On a previous kernel, frames 9-12 above were replaced by a single call
> to `pci_pm_suspend`. I've posted my detailed debugging on the older
> kernel at https://bugs.kde.org/show_bug.cgi?id=510992#c26. There I found
> that `pci_pm_suspend()` wakes PCI(e) devices before sending them into a
> full sleep state, but in the process, `_mmc_detect_change()` will
> "Prevent system sleep for 5s to allow user space to consume the
> corresponding uevent"... which interrupts a system sleep in progress.
>
> On my current kernel, the same logic applies, but reading the source I
> can't tell where `__mmc_claim_host()` is actually calling
> `__pm_runtime_resume()`. Nonetheless the problem remains that
> `rpm_resume()` is called during system suspend, `mmc_detect_change()`
> wakes the system when called, and this will abort system sleep when
> `/sys/power/wakeup_count` is active.

__mmc_claim_host() will call pm_runtime_get_sync() to runtime resume
the mmc host device.

The mmc host device's parent (a pci device) will then be runtime
resumed too. That's the call to rtsx_pci_runtime_resume() we see
above.

The problem is then that rtsx_pci_runtime_resume() invokes a callback
(->card_event()) back into the mmc host driver
(drivers/mmc/host/rtsx_pci_sdmmc.c), which ends up calling
mmc_detect_change() to try to detect whether a card has been
inserted/removed.
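
This chain only runs when the reader is runtime suspended at the time
system suspend starts, which can be checked from user space. The helper
below is a rough sketch: its name is hypothetical, and the PCI address
in the usage note assumes domain 0000 in front of the 6e:00.0 that
lspci reports.

```shell
#!/bin/sh
# Hypothetical helper: print the runtime PM status of a PCI device.
#   $1: full PCI address, e.g. 0000:6e:00.0 (domain 0000 is an assumption)
#   $2: sysfs root, defaults to /sys (overridable for testing)
pci_runtime_status() {
    addr=$1
    sysfs=${2:-/sys}
    f="$sysfs/bus/pci/devices/$addr/power/runtime_status"
    if [ -r "$f" ]; then
        # Typically "active" or "suspended"
        cat "$f"
    else
        echo "unknown"
        return 1
    fi
}
```

Running `pci_runtime_status 0000:6e:00.0` just before suspending should
report "suspended" if the reader has to be runtime resumed on the
suspend path as described above.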

>
> ## Next steps
>
> How would this problem be addressed? Off the top of my head, perhaps you
> could not call `__pm_runtime_resume()` on a SD card reader during the
> `device_suspend()` process, not call `pm_wakeup_ws_event()` when the SD
> card status changes, not call `pm_wakeup_ws_event()` *specifically*
> when system suspend is temporarily waking up a SD card reader, or
> disable pm_wakeup_ws_event() entirely during the suspend process (does
> this defeat the purpose of the function?).

Let me think a bit about what makes the most sense here. I will get
back to you in a couple of days.

>
> Are there other drivers which cause the same symptoms? I don't know. I
> asked on the KDE bug tracker for other users to attempt a failed sleep
> with `echo 1 > /sys/power/pm_debug_messages` active, to identify which
> driver broke suspend in their system; so far nobody has replied with
> logs.
>
> Given that this bug is related to `/sys/power/wakeup_count`
> (https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-power), I
> was considering CCing Rafael J. Wysocki <rafael@...nel.org> and
> linux-pm@...r.kernel.org, but have decided to only message the MMC
> maintainers for now. If necessary we may have to forward this message
> there to get their attention.
>
> ----
>
> System information:
>
> * I have an Intel NUC8i7BEH mini PC, with CPU 8 × Intel® Core™ i7-8559U
>   CPU @ 2.70GHz.
>
>     * uname -mi prints `x86_64 unknown`.
>
> * `lspci -nn` prints
>   "6e:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS522A PCI Express Card Reader [10ec:522a] (rev 01)".
>
> * I am running kernel 6.18.0-0.rc7.357.vanilla.fc43.x86_64 from the Fedora COPRs
>   (https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories).
>
> * dmesg at https://gist.github.com/nyanpasu64/ab5d3d1565aafe6c1c08cbcaf074e44a#file-dmesg-2025-11-25-txt
>
> * Fully resolved config at https://gist.github.com/nyanpasu64/ab5d3d1565aafe6c1c08cbcaf074e44a#file-config-6-18-0-0-rc7-357-vanilla-fc43-x86_64,
>   source at https://download.copr.fedorainfracloud.org/results/@kernel-vanilla/mainline-wo-mergew/fedora-43-x86_64/09831015-mainline-womergew-releases/kernel-6.18.0-0.rc7.357.vanilla.fc43.src.rpm

Kind regards
Uffe
