Message-ID: <aXzUxjNz1ysm-Iih@pathway.suse.cz>
Date: Fri, 30 Jan 2026 16:56:54 +0100
From: Petr Mladek <pmladek@...e.com>
To: ysard <ysard_git@....fr>
Cc: John Ogness <john.ogness@...utronix.de>, linux-kernel@...r.kernel.org,
senozhatsky@...omium.org
Subject: Re: Regression: system freeze on resume from suspend introduced by
printk per-console suspended state
On Thu 2026-01-29 10:34:53, ysard wrote:
> Summary
> =======
>
> The patch works only when I *uncomment* the 2 `synchronize_srcu(&console_srcu);` lines!
That is pretty interesting information. It is good to know that
synchronize_srcu() is needed and has a positive effect.
> With synchronize_srcu commented (as is)
> =======================================
>
> No freeze (expected):
> $ sudo sh -c "
> mkdir -p /var/run/nvidia-sleep \
> && echo 2 > /var/run/nvidia-sleep/Xorg.vt_number \
> && chvt 63 \
> && systemctl suspend"
>
> Logs:
> [ 378.634960] [ T2805] printk: Suspending console(s) (use no_console_suspend to debug)
> [ 378.634961] [ T2805] printk: console_suspend_all
> [ 378.669076] [ T2831] printk: console_trylock
> [ 378.669385] [ T271] printk: console_trylock
> [ 378.677075] [ T2826] printk: console_trylock
> [ 378.677094] [ T2832] printk: console_trylock
> [ 378.677597] [ T269] printk: console_trylock
> [ 378.678842] [ T273] printk: console_trylock
> [ 378.820962] [ T100] printk: console_trylock
> [ 379.498727] [ T2805] printk: console_trylock
> [ 379.537391] [ T2805] printk: console_trylock
> [ 379.650475] [ T2805] printk: console_trylock
> [ 379.650477] [ T2805] printk: console_trylock
> [ 379.650478] [ T2805] printk: console_trylock
> [ 379.650618] [ T2805] printk: console_trylock
> [ 379.652295] [ T2805] printk: console_trylock
> [ 379.654336] [ T2805] printk: console_trylock
> [ 379.656435] [ T2805] printk: console_trylock
> [ 379.658727] [ T2805] printk: console_trylock
> [ 379.660623] [ T2805] printk: console_trylock
> [ 379.662429] [ T2805] printk: console_trylock
> [ 379.664265] [ T2805] printk: console_trylock
> [ 379.666130] [ T2805] printk: console_trylock
> [ 379.666154] [ T2805] printk: console_trylock
> [ 379.666156] [ T2805] printk: console_trylock
> [ 379.666607] [ T2805] printk: console_trylock
> [ 379.666654] [ T2805] printk: console_trylock
> [ 379.670212] [ T2805] printk: console_trylock
> [ 379.670243] [ T2805] printk: console_trylock
> [ 379.673828] [ T2805] printk: console_trylock
> [ 379.673863] [ T2805] printk: console_trylock
> [ 379.677518] [ T2805] printk: console_trylock
> [ 379.677547] [ T2805] printk: console_trylock
> [ 379.680645] [ T2805] printk: console_trylock
> [ 379.680675] [ T2805] printk: console_trylock
> [ 379.683811] [ T2805] printk: console_trylock
> [ 379.683836] [ T2805] printk: console_trylock
> [ 379.686960] [ T2805] printk: console_trylock
> [ 379.686984] [ T2805] printk: console_trylock
> [ 379.690141] [ T2805] printk: console_trylock
> [ 379.693604] [ T2805] printk: console_trylock
> [ 379.753022] [ T2805] printk: console_trylock
> [ 379.754384] [ T2805] printk: console_trylock
> [ 379.754386] [ T2831] printk: console_trylock
> [ 379.754392] [ T2831] printk: console_trylock
> [ 379.754395] [ T2831] printk: console_trylock
> [ 379.754421] [ T2825] printk: console_trylock
> [ 379.754466] [ T2829] printk: console_trylock
> [ 379.754563] [ T2826] printk: console_trylock
> [ 379.766769] [ T11] printk: console_lock
> [ 379.766772] [ T11] printk: console_unlock
> [ 380.030746] [ T2826] printk: console_trylock
> [ 380.030749] [ T2825] printk: console_trylock
> [ 380.072183] [ T11] printk: console_trylock
> [ 380.081075] [ T271] printk: console_trylock
> [ 380.081100] [ T273] printk: console_trylock
> [ 380.081734] [ T273] printk: console_trylock
> [ 380.082280] [ T271] printk: console_trylock
> [ 380.083257] [ T271] printk: console_trylock
> [ 380.083767] [ T273] printk: console_trylock
> [ 380.084464] [ T271] printk: console_trylock
> [ 380.086088] [ T273] printk: console_trylock
> [ 380.091983] [ T334] printk: console_trylock
> [ 380.094504] [ T271] printk: console_trylock
> [ 380.095494] [ T271] printk: console_trylock
> [ 380.096698] [ T271] printk: console_trylock
> [ 380.098628] [ T11] printk: console_trylock
> [ 380.099849] [ T84] printk: console_trylock
> [ 380.104195] [ T271] printk: console_trylock
> [ 380.104493] [ T273] printk: console_trylock
> [ 380.106533] [ T273] printk: console_trylock
> [ 380.108797] [ T273] printk: console_trylock
> [ 380.122616] [ T273] printk: console_trylock
> [ 380.158723] [ T2831] printk: console_trylock
> [ 380.470807] [ T2825] printk: console_trylock
> [ 380.652605] [ T2805] printk: console_resume_all
>
> Proc comm values:
> 100: kworker/5:1-mm_percpu_wq
> 11: kworker/0:1-events
> 269: scsi_eh_0
> 271: scsi_eh_1
> 273: scsi_eh_2
> 2805: not exists or not readable
> 2825: not exists or not readable
> 2826: not exists or not readable
> 2829: not exists or not readable
> 2831: kworker/u32:17-flush-253:1
> 2832: not exists or not readable
> 334: not exists or not readable
> 76: kauditd
> 84: kworker/2:1-events
>
> ---
>
> Freeze (as before):
> $ sudo sh -c "
> mkdir -p /var/run/nvidia-sleep \
> && echo 2 > /var/run/nvidia-sleep/Xorg.vt_number \
> && chvt 63 \
> && echo suspend >/proc/driver/nvidia/suspend \
> && systemctl suspend"
Note that the disabled synchronize_srcu() might cause other
problems (races) which are not related to the nvidia driver.
It is possible that it failed for other reasons this time.
Anyway, it is great to know that the synchronize_srcu(&console_srcu)
calls could and actually should stay.
> With synchronize_srcu uncommented
> =================================
>
> No freeze (expected):
> $ sudo sh -c "
> mkdir -p /var/run/nvidia-sleep \
> && echo 2 > /var/run/nvidia-sleep/Xorg.vt_number \
> && chvt 63 \
> && systemctl suspend"
>
> Logs:
> [ 97.006391] [ T2643] printk: Suspending console(s) (use no_console_suspend to debug)
> [ 97.006393] [ T2643] printk: console_suspend_all
> [ 97.043073] [ T2654] printk: console_trylock
> [ 97.043290] [ T2655] printk: console_trylock
> [ 97.043486] [ T270] printk: console_trylock
> [ 97.044873] [ T272] printk: console_trylock
> [ 97.067023] [ T2656] printk: console_trylock
> [ 97.067403] [ T268] printk: console_trylock
> [ 97.138911] [ T9] printk: console_trylock
> [ 97.853561] [ T2643] printk: console_trylock
> [ 97.891353] [ T2643] printk: console_trylock
> [ 98.003694] [ T2643] printk: console_trylock
> [ 98.003696] [ T2643] printk: console_trylock
> [ 98.003697] [ T2643] printk: console_trylock
> [ 98.003838] [ T2643] printk: console_trylock
> [ 98.005787] [ T2643] printk: console_trylock
> [ 98.008142] [ T2643] printk: console_trylock
> [ 98.010180] [ T2643] printk: console_trylock
> [ 98.012412] [ T2643] printk: console_trylock
> [ 98.014319] [ T2643] printk: console_trylock
> [ 98.016306] [ T2643] printk: console_trylock
> [ 98.018004] [ T2643] printk: console_trylock
> [ 98.019610] [ T2643] printk: console_trylock
> [ 98.019634] [ T2643] printk: console_trylock
> [ 98.019636] [ T2643] printk: console_trylock
> [ 98.020082] [ T2643] printk: console_trylock
> [ 98.020130] [ T2643] printk: console_trylock
> [ 98.023697] [ T2643] printk: console_trylock
> [ 98.023730] [ T2643] printk: console_trylock
> [ 98.027334] [ T2643] printk: console_trylock
> [ 98.027369] [ T2643] printk: console_trylock
> [ 98.031044] [ T2643] printk: console_trylock
> [ 98.031074] [ T2643] printk: console_trylock
> [ 98.034191] [ T2643] printk: console_trylock
> [ 98.034221] [ T2643] printk: console_trylock
> [ 98.037379] [ T2643] printk: console_trylock
> [ 98.037405] [ T2643] printk: console_trylock
> [ 98.040576] [ T2643] printk: console_trylock
> [ 98.040599] [ T2643] printk: console_trylock
> [ 98.043786] [ T2643] printk: console_trylock
> [ 98.047294] [ T2643] printk: console_trylock
> [ 98.107363] [ T2643] printk: console_trylock
> [ 98.108733] [ T2643] printk: console_trylock
> [ 98.108748] [ T64] printk: console_trylock
> [ 98.108751] [ T70] printk: console_trylock
> [ 98.108753] [ T70] printk: console_trylock
> [ 98.108862] [ T2660] printk: console_trylock
> [ 98.108869] [ T2665] printk: console_trylock
> [ 98.124749] [ T84] printk: console_lock
> [ 98.124752] [ T84] printk: console_unlock
> [ 98.384581] [ T64] printk: console_trylock
> [ 98.384591] [ T2665] printk: console_trylock
> [ 98.432411] [ T84] printk: console_trylock
> [ 98.442955] [ T272] printk: console_trylock
> [ 98.442980] [ T270] printk: console_trylock
> [ 98.443315] [ T272] printk: console_trylock
> [ 98.444400] [ T270] printk: console_trylock
> [ 98.445386] [ T270] printk: console_trylock
> [ 98.445776] [ T272] printk: console_trylock
> [ 98.446591] [ T270] printk: console_trylock
> [ 98.448456] [ T272] printk: console_trylock
> [ 98.454112] [ T88] printk: console_trylock
> [ 98.456644] [ T270] printk: console_trylock
> [ 98.457620] [ T270] printk: console_trylock
> [ 98.458827] [ T270] printk: console_trylock
> [ 98.460353] [ T84] printk: console_trylock
> [ 98.464803] [ T9] printk: console_trylock
> [ 98.466327] [ T270] printk: console_trylock
> [ 98.470287] [ T272] printk: console_trylock
> [ 98.472759] [ T272] printk: console_trylock
> [ 98.475445] [ T272] printk: console_trylock
> [ 98.491793] [ T272] printk: console_trylock
> [ 98.512594] [ T70] printk: console_trylock
> [ 98.824640] [ T64] printk: console_trylock
> [ 99.006658] [ T2643] printk: console_resume_all
>
> Proc comm values:
> 2643: not exists or not readable
> 2654: kworker/u32:12-async
> 2655: kworker/u32:13-async
> 2656: kworker/u32:14-async
> 2660: kworker/u32:18-events_unbound
> 2665: kworker/u32:23-pm
> 268: scsi_eh_0
> 270: scsi_eh_1
> 272: scsi_eh_2
> 64: kworker/u32:1-events_unbound
> 70: kworker/u32:7-pm
> 76: kauditd
> 84: kworker/1:1-events
> 88: kworker/2:1-cgroup_free
> 9: kworker/0:0-events
>
> ---
>
> Works now!:
> $ sudo sh -c "
> mkdir -p /var/run/nvidia-sleep \
> && echo 2 > /var/run/nvidia-sleep/Xorg.vt_number \
> && chvt 63 \
> && echo suspend >/proc/driver/nvidia/suspend \
> && systemctl suspend"
>
> Logs:
> [ 338.901995] [ T3134] printk: Suspending console(s) (use no_console_suspend to debug)
> [ 338.901997] [ T3134] printk: console_suspend_all
> [ 338.932763] [ T2672] printk: console_trylock
> [ 338.948664] [ T2659] printk: console_trylock
> [ 338.948685] [ T2671] printk: console_trylock
> [ 338.950716] [ T272] printk: console_trylock
> [ 338.950747] [ T270] printk: console_trylock
> [ 338.982194] [ T3134] printk: console_trylock
> [ 339.020910] [ T3134] printk: console_trylock
> [ 339.044613] [ T158] printk: console_trylock
> [ 339.132842] [ T3134] printk: console_trylock
> [ 339.132843] [ T3134] printk: console_trylock
> [ 339.132844] [ T3134] printk: console_trylock
> [ 339.132980] [ T3134] printk: console_trylock
> [ 339.134648] [ T3134] printk: console_trylock
> [ 339.136741] [ T3134] printk: console_trylock
> [ 339.138806] [ T3134] printk: console_trylock
> [ 339.141014] [ T3134] printk: console_trylock
> [ 339.142875] [ T3134] printk: console_trylock
> [ 339.144627] [ T3134] printk: console_trylock
> [ 339.146490] [ T3134] printk: console_trylock
> [ 339.148175] [ T3134] printk: console_trylock
> [ 339.148200] [ T3134] printk: console_trylock
> [ 339.148201] [ T3134] printk: console_trylock
> [ 339.148657] [ T3134] printk: console_trylock
> [ 339.148706] [ T3134] printk: console_trylock
> [ 339.152285] [ T3134] printk: console_trylock
> [ 339.152318] [ T3134] printk: console_trylock
> [ 339.155931] [ T3134] printk: console_trylock
> [ 339.155964] [ T3134] printk: console_trylock
> [ 339.159760] [ T3134] printk: console_trylock
> [ 339.159792] [ T3134] printk: console_trylock
> [ 339.162961] [ T3134] printk: console_trylock
> [ 339.162987] [ T3134] printk: console_trylock
> [ 339.166055] [ T3134] printk: console_trylock
> [ 339.166079] [ T3134] printk: console_trylock
> [ 339.169188] [ T3134] printk: console_trylock
> [ 339.169214] [ T3134] printk: console_trylock
> [ 339.172501] [ T3134] printk: console_trylock
> [ 339.176003] [ T3134] printk: console_trylock
> [ 339.225601] [ T3134] printk: console_trylock
> [ 339.226950] [ T66] printk: console_trylock
> [ 339.226954] [ T66] printk: console_trylock
> [ 339.226957] [ T66] printk: console_trylock
> [ 339.226964] [ T65] printk: console_trylock
> [ 339.226969] [ T3134] printk: console_trylock
> [ 339.227037] [ T2671] printk: console_trylock
> [ 339.227045] [ T2661] printk: console_trylock
> [ 339.254208] [ T9] printk: console_lock
> [ 339.254214] [ T9] printk: console_unlock
> [ 339.502296] [ T2673] printk: console_trylock
> [ 339.506297] [ T67] printk: console_trylock
> [ 339.561605] [ T272] printk: console_trylock
> [ 339.561632] [ T270] printk: console_trylock
> [ 339.562072] [ T272] printk: console_trylock
> [ 339.563175] [ T270] printk: console_trylock
> [ 339.564203] [ T270] printk: console_trylock
> [ 339.564703] [ T272] printk: console_trylock
> [ 339.565094] [ T272] printk: console_trylock
> [ 339.565341] [ T270] printk: console_trylock
> [ 339.570554] [ T1127] printk: console_trylock
> [ 339.571705] [ T272] printk: console_trylock
> [ 339.573017] [ T310] printk: console_trylock
> [ 339.574203] [ T272] printk: console_trylock
> [ 339.575888] [ T270] printk: console_trylock
> [ 339.575899] [ T272] printk: console_trylock
> [ 339.580570] [ T272] printk: console_trylock
> [ 339.582271] [ T270] printk: console_trylock
> [ 339.583508] [ T270] printk: console_trylock
> [ 339.586181] [ T9] printk: console_trylock
> [ 339.591087] [ T270] printk: console_trylock
> [ 339.614444] [ T9] printk: console_trylock
> [ 339.634304] [ T3149] printk: console_trylock
> [ 339.942244] [ T2673] printk: console_trylock
> [ 340.123772] [ T3134] printk: console_resume_all
>
> Proc comm values:
> 1127: kworker/3:2-cgroup_offline
> 158: kworker/5:1-mm_percpu_wq
> 2659: kworker/u32:17-async
> 2661: kworker/u32:19-async
> 2671: kworker/u32:29-async
> 2672: kworker/u32:30-async
> 2673: kworker/u32:31-kvfree_rcu_reclaim
> 270: scsi_eh_1
> 272: scsi_eh_2
> 310: kworker/2:2-events
> 3134: not exists or not readable
> 3149: kworker/u32:38-flush-253:3
> 65: kworker/u32:2-async
> 66: kworker/u32:3-async
> 67: kworker/u32:4-async
> 9: kworker/0:0-events
It is hard to know what is going on there. I guess that many
console_trylock() calls are from printk(). But they might also
be from tty or from the nvidia driver code.
I have tried to create a patch which would print backtraces
of the callers. The output might be interesting. I am going
to send it in a separate mail.
> On 2026-01-28, John Ogness wrote:
> > Also, if the patch still has the problem, it would be nice to see the
> > dmesg output with the patch applied when you do only the nvidia
> > suspend/resume and avoid systemctl.
>
> Unfortunately, there is no printk return under these conditions.
>
> When searching for the names of the console_suspend/console_resume and
> console_lock()/console_unlock() functions in the kms module sources,
> I only found two uses of the latter, to which I added logs.
> These functions are only called very early during startup and then never again;
> Maybe this will help.
>
> [ 4.967951] [ T248] NV_API_CALL: os_disable_console_access
> [ 5.164823] [ T248] NV_API_CALL: os_enable_console_access
> [ 6.724895] [ T248] NV_API_CALL: os_disable_console_access
> [ 6.724988] [ T248] NV_API_CALL: os_enable_console_access
>
>
> common/inc/nv-linux.h:
>
> /*
> * Early 2.6 kernels have acquire_console_sem, but from 2.6.38+ it was
> * renamed to console_lock.
> */
> #if defined(NV_ACQUIRE_CONSOLE_SEM_PRESENT)
> #define NV_ACQUIRE_CONSOLE_SEM() acquire_console_sem()
> #define NV_RELEASE_CONSOLE_SEM() release_console_sem()
> #elif defined(NV_CONSOLE_LOCK_PRESENT)
> #define NV_ACQUIRE_CONSOLE_SEM() console_lock()
> #define NV_RELEASE_CONSOLE_SEM() console_unlock()
> #else
> #error "console lock api unrecognized!."
> #endif
>
> nvidia/os-interface.c:
>
> NV_STATUS NV_API_CALL os_disable_console_access(void)
> {
> nv_printf(NV_DBG_ERRORS, "NV_API_CALL: os_disable_console_access\n");
> NV_ACQUIRE_CONSOLE_SEM();
> return NV_OK;
> }
>
> NV_STATUS NV_API_CALL os_enable_console_access(void)
> {
> nv_printf(NV_DBG_ERRORS, "NV_API_CALL: os_enable_console_access\n");
> NV_RELEASE_CONSOLE_SEM();
> return NV_OK;
> }
Good to know. So the nvidia driver synchronizes some operations
using console_lock() as well, and it might be affected by
the modified behavior. For example, before commit
9e70a5e109a4a2336 ("printk: Add per-console suspended state"),
during suspend:
+ console_trylock() never succeeded
+ console_lock() set neither console_locked nor
console_may_schedule.
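To make the two points above concrete, here is a rough single-threaded
user-space model of that older behavior. It is only a sketch and not the
kernel source: the model_*() functions, the toy console_sem integer, and
the sem_trylock()/sem_unlock() helpers are made up for illustration. Only
the names console_suspended, console_locked and console_may_schedule
mirror the printk variables.

```c
#include <stdbool.h>

/* Simplified stand-ins for the printk globals; not the kernel definitions. */
static bool console_suspended;    /* old global flag, set during suspend */
static int console_locked;
static int console_may_schedule;
static int console_sem = 1;       /* toy semaphore: 1 = free, 0 = held */

/* Hypothetical helper: take the toy semaphore if it is free. */
bool sem_trylock(void)
{
	if (console_sem == 0)
		return false;
	console_sem = 0;
	return true;
}

/* Hypothetical helper: release the toy semaphore. */
void sem_unlock(void)
{
	console_sem = 1;
}

/* Modeled old behavior: never succeeds while consoles are suspended. */
int model_console_trylock(void)
{
	if (!sem_trylock())
		return 0;
	if (console_suspended) {
		sem_unlock();
		return 0;
	}
	console_locked = 1;
	console_may_schedule = 0;
	return 1;
}

/*
 * Modeled old behavior: the semaphore is taken, but both flags are
 * left untouched while consoles are suspended. The real function
 * sleeps until the semaphore is free; this single-threaded model
 * simply assumes that it is.
 */
void model_console_lock(void)
{
	sem_trylock();
	if (console_suspended)
		return;
	console_locked = 1;
	console_may_schedule = 1;
}

/* Clear the flags and release the toy semaphore. */
void model_console_unlock(void)
{
	console_locked = 0;
	console_may_schedule = 0;
	sem_unlock();
}
```

With console_suspended set, model_console_trylock() returns 0 and a
model_console_lock()/model_console_unlock() pair leaves console_locked
and console_may_schedule at 0, so a caller which relies on those
flags would see a different state than outside of suspend.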
...
Best Regards,
Petr
PS: I am going to write some more ideas into another mail with
a new debug patch. I am not sure whether I will send it today.
I have to leave in 30 minutes or so...