Message-ID: <trinity-b4acae19-b828-49d5-9e31-72d25d62f9bc-1769154279824@3c-app-mailcom-bs09>
Date: Fri, 23 Jan 2026 08:44:39 +0100
From: ysard <ysard_git@....fr>
To: John Ogness <john.ogness@...utronix.de>
Cc: linux-kernel@...r.kernel.org, pmladek@...e.com, senozhatsky@...omium.org
Subject: Re: Regression: system freeze on resume from suspend introduced by
printk per-console suspended state
Good evening, thank you for your reply and the patch.
Summary
======
The patch does not seem to have any effect on the problem, *but* I have found a
way to work around the freeze: disabling the `nvidia-suspend` service.
Additional info for diagnostics
===============================
$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 470.256.02 Thu May 2 14:37:44 UTC 2024
GCC version: gcc version 12.5.0 (Debian 12.5.0-6)
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
Procedure requested
===================
> I have attached a patch (based on 6.19-rc4). It should restore the old
> console_lock behavior during suspend/resume. Assuming this works for
> you, it also adds some debugging information so that we can figure out
> who is locking the console.
I applied the patch. The behavior is the same as before (no resume).
$ uname -r
6.19.0-rc4-dirty
$ dmesg | grep printk
[ 0.030102] [ T0] printk: log buffer data + meta data: 131072 + 458752 = 589824 bytes
[ 0.077779] [ T0] printk: legacy console [tty0] enabled
[ 152.678589] [ T1349] printk: Suspending console(s) (use no_console_suspend to debug)
...
no resume
Temporary solution
==================
I had the idea of restarting in recovery mode (rescue.target) to run the test.
The `systemctl suspend` command is not available in this mode, which forced me
to use the `pm-suspend` command instead. That command suspends and resumes
correctly on every kernel version I have been able to test so far.
Systemd triggers a number of services before actually going into sleep mode,
including a call to nvidia-suspend.service, which I disabled
("because it's always nvidia").
The following command restores normal operation of `systemctl suspend`,
including on the first non-functional commit found by the bisect
(9e70a5e109a4a23367810de09be826c52d27ee2f).
$ systemctl disable nvidia-suspend.service
This service calls a script `/usr/bin/nvidia-sleep.sh` that seems to play with
vt consoles and expects that they are still usable (`chvt 63` ?):
#!/bin/bash
if [ ! -f /proc/driver/nvidia/suspend ]; then
    exit 0
fi
RUN_DIR="/var/run/nvidia-sleep"
XORG_VT_FILE="${RUN_DIR}"/Xorg.vt_number
PATH="/bin:/usr/bin"
case "$1" in
    suspend|hibernate)
        mkdir -p "${RUN_DIR}"
        fgconsole > "${XORG_VT_FILE}"
        chvt 63
        if [[ $? -ne 0 ]]; then
            exit $?
        fi
        echo "$1" > /proc/driver/nvidia/suspend
        exit $?
        ;;
    resume)
        echo "$1" > /proc/driver/nvidia/suspend
        #
        # Check if Xorg was determined to be running at the time
        # of suspend, and whether its VT was recorded. If so,
        # attempt to switch back to this VT.
        #
        if [[ -f "${XORG_VT_FILE}" ]]; then
            XORG_PID=$(cat "${XORG_VT_FILE}")
            rm "${XORG_VT_FILE}"
            chvt "${XORG_PID}"
        fi
        exit 0
        ;;
    *)
        exit 1
esac
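A side note on the `chvt 63` check in the script above: in a shell, `$?` read
inside the `then` branch holds the status of the test command itself, so the
`exit $?` there would return 0 even when `chvt` failed, and the script would end
before writing to /proc/driver/nvidia/suspend. Below is a minimal sketch of that
behavior, not a diagnosis of the freeze. `fake_chvt` is a hypothetical stand-in
for a failing `chvt 63`, the functions use `return` where the script uses `exit`,
and the POSIX `[ ]` test is used so the sketch runs under any sh (bash's `[[ ]]`
sets `$?` the same way).

```shell
# Hypothetical stand-in for a failing `chvt 63`; not part of the NVIDIA script.
fake_chvt() { return 3; }

# Same pattern as the script: $? is read again after the test has run.
as_written() {
    fake_chvt
    if [ $? -ne 0 ]; then
        return $?          # status of the `[ ... ]` test, which is 0
    fi
    return 99
}

# Variant that saves the status before testing it.
saved_status() {
    fake_chvt
    chvt_rc=$?
    if [ "$chvt_rc" -ne 0 ]; then
        return "$chvt_rc"  # status of fake_chvt, which is 3
    fi
    return 99
}

rc=0; as_written || rc=$?
echo "as written:   $rc"   # prints 0
rc=0; saved_status || rc=$?
echo "saved status: $rc"   # prints 3
```

Whether a failed VT switch actually occurs on the affected kernels is not
established here; the sketch only shows that the script as written would not
report one.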
Conclusion
==========
  kernel            nvidia-suspend (systemd 259~rc1-1)   result
  <  9e70a5e109a4   enabled                              ok
  <  9e70a5e109a4   disabled                             ok
  >= 9e70a5e109a4   enabled                              freeze
  >= 9e70a5e109a4   disabled                             ok
- Re-enabling this service makes the freeze reappear reproducibly.
- The `pm-suspend` command has never stopped working.
It seems that this is a two-sided problem?
If the kernel is not the issue, I apologize for wasting your time;
I should have thought about the layers added by systemd sooner.
Best regards.
Extra
=====
During my tests with 6.19.0-rc1 and 6.19.0-rc4, I noticed that resuming from a
suspend test (pm_test) that worked in 6.18.2 now fails, but I think this is
unrelated and due to another issue. I am noting it here for the record.
$ echo core > /sys/power/pm_test
$ echo deep > /sys/power/mem_sleep
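Both files show their current setting in square brackets when read back, so the
two writes above can be confirmed before suspending. A minimal sketch, assuming
the bracketed format described in the kernel's PM debugging documentation;
`active_mode` is a hypothetical helper name, and the sample strings are
illustrative, not captured from this machine:

```shell
# Print the bracketed (currently selected) entry of a sysfs selector line
# read from standard input.
active_mode() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Illustrative sample strings (assumed format, not real output).
echo 'none [core] processors platform devices freezer' | active_mode   # prints: core
echo 's2idle [deep]' | active_mode                                      # prints: deep

# On a real system, read the files themselves when they are present.
for f in /sys/power/pm_test /sys/power/mem_sleep; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(active_mode < "$f")"
    fi
done
```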
Both `pm-suspend` and `systemctl suspend` have the same effect:
- Trigger suspend (`kernel: PM: suspend entry (deep)` in dmesg);
- No response when pressing the power button to wake up;
- Force shutdown by holding down the power button;
- The computer shuts down but the motherboard indicates a state similar to
sleep mode (LED flashing);
- Pressing the power button starts the computer (fans + HDD spin up) for a
fraction of a second (<1s) then the machine shuts down;
- Pressing the power button starts the machine normally
(not a resume from sleep mode).