Message-ID: <trinity-b4acae19-b828-49d5-9e31-72d25d62f9bc-1769154279824@3c-app-mailcom-bs09>
Date: Fri, 23 Jan 2026 08:44:39 +0100
From: ysard <ysard_git@....fr>
To: John Ogness <john.ogness@...utronix.de>
Cc: linux-kernel@...r.kernel.org, pmladek@...e.com, senozhatsky@...omium.org
Subject: Re: Regression: system freeze on resume from suspend introduced by
 printk per-console suspended state

Good evening, thank you for your reply and the patch.


Summary
=======

The patch does not seem to have any effect on the problem, *but* I have found a
workaround: disabling the `nvidia-suspend` service makes the freeze go away.


Additional info for diagnostics
===============================

$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  470.256.02  Thu May  2 14:37:44 UTC 2024
GCC version:  gcc version 12.5.0 (Debian 12.5.0-6)

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89


Procedure requested
===================

> I have attached a patch (based on 6.19-rc4). It should restore the old
> console_lock behavior during suspend/resume. Assuming this works for
> you, it also adds some debugging information so that we can figure out
> who is locking the console.

I applied the patch. The behavior is the same as before (no resume).

$ uname -r
6.19.0-rc4-dirty

$ dmesg | grep printk
[    0.030102] [      T0] printk: log buffer data + meta data: 131072 + 458752 = 589824 bytes
[    0.077779] [      T0] printk: legacy console [tty0] enabled
[  152.678589] [   T1349] printk: Suspending console(s) (use no_console_suspend to debug)
...
no resume
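
The log line above itself points at the `no_console_suspend` kernel parameter as a debugging aid. Below is a minimal sketch for checking whether a given kernel command line carries that parameter; the function name `has_no_console_suspend` is made up for this illustration and is not part of any existing tool.

```shell
# Succeeds (exit status 0) if the given kernel command line contains
# no_console_suspend as a separate word, fails (status 1) otherwise.
# The function name is hypothetical, chosen for this sketch only.
has_no_console_suspend() {
    case " $1 " in
        *" no_console_suspend "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use on a running system (assumes /proc is mounted):
#   if has_no_console_suspend "$(cat /proc/cmdline)"; then
#       echo "consoles are kept active during suspend"
#   fi
```

Matching on whole words avoids false positives from longer parameters that merely contain the same substring.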


Temporary solution
==================

I had the idea of booting into recovery mode (rescue.target) to run the test.
`systemctl suspend` is not available in this mode, so I had to use `pm-suspend`
instead. With `pm-suspend`, suspend and resume work properly on every kernel
version I have tested so far.

Systemd triggers a number of services before actually going into sleep mode,
including a call to nvidia-suspend.service, which I disabled
("because it's always nvidia").

The following command restores normal operation of `systemctl suspend`,
including on the first bad commit found by the bisect
(9e70a5e109a4a23367810de09be826c52d27ee2f).

$ systemctl disable nvidia-suspend.service
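
When repeating this test, it can help to record the state of the NVIDIA sleep units before each suspend attempt. The sketch below is only an illustration: `nvidia-hibernate.service` and `nvidia-resume.service` are assumed to be shipped alongside `nvidia-suspend.service` (only the latter is confirmed above), and the function name and the `SYSTEMCTL` override variable are hypothetical.

```shell
# Print "<unit>: <state>" for each NVIDIA sleep-related unit.
# Assumption: the hibernate/resume units exist next to nvidia-suspend.service;
# units that cannot be queried are reported as "unknown".
# SYSTEMCTL is a hypothetical override, useful for trying this without systemd.
nvidia_sleep_unit_states() {
    for unit in nvidia-suspend.service nvidia-hibernate.service nvidia-resume.service; do
        # `systemctl is-enabled` exits non-zero for disabled units, so the
        # exit status is ignored and only the printed state is kept.
        state=$("${SYSTEMCTL:-systemctl}" is-enabled "$unit" 2>/dev/null) || true
        printf '%s: %s\n' "$unit" "${state:-unknown}"
    done
}

# Possible use: nvidia_sleep_unit_states
```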

This service calls the script `/usr/bin/nvidia-sleep.sh`, which seems to
manipulate the VT consoles (see `chvt 63`) and to expect that they are still
usable at that point:

    #!/bin/bash

    if [ ! -f /proc/driver/nvidia/suspend ]; then
        exit 0
    fi

    RUN_DIR="/var/run/nvidia-sleep"
    XORG_VT_FILE="${RUN_DIR}"/Xorg.vt_number

    PATH="/bin:/usr/bin"

    case "$1" in
        suspend|hibernate)
            mkdir -p "${RUN_DIR}"
            fgconsole > "${XORG_VT_FILE}"
            chvt 63
            if [[ $? -ne 0 ]]; then
                exit $?
            fi
            echo "$1" > /proc/driver/nvidia/suspend
            exit $?
            ;;
        resume)
            echo "$1" > /proc/driver/nvidia/suspend
            #
            # Check if Xorg was determined to be running at the time
            # of suspend, and whether its VT was recorded.  If so,
            # attempt to switch back to this VT.
            #
            if [[ -f "${XORG_VT_FILE}" ]]; then
                XORG_PID=$(cat "${XORG_VT_FILE}")
                rm "${XORG_VT_FILE}"
                chvt "${XORG_PID}"
            fi
            exit 0
            ;;
        *)
            exit 1
    esac
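
To narrow down which step of such a script is the last one reached before a freeze, each command can be wrapped in a small tracing helper. This is a generic debugging sketch, not something taken from the NVIDIA script: the helper name `trace_step`, the `TRACE_LOG` variable and the default log path are all assumptions made for this example.

```shell
# Log file for the trace; the default path is an arbitrary choice.
TRACE_LOG="${TRACE_LOG:-/var/tmp/nvidia-sleep-trace.log}"

# Run one command, recording when it starts and how it ends.
# `sync` is called after each write so that the last line has a
# better chance of surviving a hard freeze and forced power-off.
trace_step() {
    printf 'START %s\n' "$*" >> "$TRACE_LOG"
    sync
    rc=0
    "$@" || rc=$?
    printf 'END   %s rc=%s\n' "$*" "$rc" >> "$TRACE_LOG"
    sync
    return "$rc"
}

# Possible use inside a *copy* of the script:
#   trace_step chvt 63
#   trace_step sh -c 'echo suspend > /proc/driver/nvidia/suspend'
```

A `START` line without a matching `END` line in the log after a forced reboot would indicate the command that never returned.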


Conclusion
==========

kernel              nvidia-suspend (systemd 259~rc1-1)  result
<  9e70a5e109a4     enabled                             ok
<  9e70a5e109a4     disabled                            ok
>= 9e70a5e109a4     enabled                             freeze
>= 9e70a5e109a4     disabled                            ok

- Re-enabling this service reliably brings the freeze back.
- The `pm-suspend` command has never stopped working.

So this looks like a two-sided problem?
If the kernel is not at fault, I apologize for wasting your time;
I should have thought about the layers added by systemd sooner.

Best regards.


Extra
=====

During my tests with 6.19.0-rc1 and 6.19.0-rc4, I noticed that a suspend test
that used to work (it worked in 6.18.2) now fails to resume. I think this is
unrelated and due to another issue; I am noting it here for the record.

$ echo core > /sys/power/pm_test
$ echo deep > /sys/power/mem_sleep
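
Both sysfs files report the active choice in square brackets when read back (for example `s2idle [deep]`), so the two writes above can be verified before suspending. A minimal sketch follows; the function name `selected_option` is made up for this illustration.

```shell
# Print the bracketed (currently selected) word from a sysfs-style
# option list such as "s2idle [deep]"; print nothing if no word is marked.
# The function name is hypothetical, chosen for this sketch only.
selected_option() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Possible use on a running system:
#   selected_option "$(cat /sys/power/pm_test)"
#   selected_option "$(cat /sys/power/mem_sleep)"
```

If the writes took effect, the two calls should report `core` and `deep` respectively.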

Both `pm-suspend` and `systemctl suspend` have the same effect:

- Trigger suspend (`kernel: PM: suspend entry (deep)` in dmesg);
- No response when pressing the power button to wake up;
- Force shutdown by holding down the power button;
- The computer shuts down but the motherboard indicates a state similar to
  sleep mode (LED flashing);
- Pressing the power button starts the computer (fans + HDD spin up) for a
  fraction of a second (<1s) then the machine shuts down;
- Pressing the power button again starts the machine normally
  (not a resume from sleep mode).

