Message-ID: <877bvukzs1.fsf@jogness.linutronix.de>
Date: Thu, 13 Nov 2025 11:17:42 +0106
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, Steven Rostedt
<rostedt@...dmis.org>, Sherry Sun <sherry.sun@....com>, Jacky Bai
<ping.bai@....com>, Jon Hunter <jonathanh@...dia.com>, Thierry Reding
<thierry.reding@...il.com>, Derek Barbosa <debarbos@...hat.com>,
linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH printk v1 1/1] printk: Avoid scheduling irq_work on suspend
On 2025-11-13, Petr Mladek <pmladek@...e.com> wrote:
>> I would prefer to keep all the printk_get_console_flush_type() changes
>> since it returns proper available flush type information. If you would
>> like to _additionally_ short-circuit __wake_up_klogd() and
>> nbcon_kthreads_wake() in order to avoid all possible irq_work queueing,
>> I would be OK with that.
>
> Combining both approaches might be a bit messy. Some code paths might
> work correctly because of the explicit check and some just by chance.
>
> But I got an idea. We could add a warning into __wake_up_klogd()
> and nbcon_kthreads_wake() to report when they are called unexpectedly.
>
> And we should also prevent calling it from lib/nmi_backtrace.c.
>
> I played with it and came up with the following changes on
> top of this patch:
>
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 45c663124c9b..71e31b908ad1 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -203,6 +203,7 @@ void dump_stack_print_info(const char *log_lvl);
>  void show_regs_print_info(const char *log_lvl);
>  extern asmlinkage void dump_stack_lvl(const char *log_lvl) __cold;
>  extern asmlinkage void dump_stack(void) __cold;
> +void printk_try_flush(void);
>  void printk_trigger_flush(void);
>  void console_try_replay_all(void);
>  void printk_legacy_allow_panic_sync(void);
> @@ -299,6 +300,9 @@ static inline void dump_stack_lvl(const char *log_lvl)
>  static inline void dump_stack(void)
>  {
>  }
> +static inline void printk_try_flush(void)
> +{
> +}
>  static inline void printk_trigger_flush(void)
>  {
>  }
> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> index ffd5a2593306..a09b8502e507 100644
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1302,6 +1302,13 @@ void nbcon_kthreads_wake(void)
>  	if (!printk_kthreads_running)
>  		return;
>
> +	/*
> +	 * Nobody is allowed to call this function when console irq_work
> +	 * is blocked.
> +	 */
> +	if (WARN_ON_ONCE(console_irqwork_blocked))
> +		return;
> +
>  	cookie = console_srcu_read_lock();
>  	for_each_console_srcu(con) {
>  		if (!(console_srcu_read_flags(con) & CON_NBCON))
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 334b4edff08c..e9290c418d12 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -4581,6 +4581,13 @@ static void __wake_up_klogd(int val)
>  	if (!printk_percpu_data_ready())
>  		return;
>
> +	/*
> +	 * Nobody is allowed to call this function when console irq_work
> +	 * is blocked.
> +	 */
> +	if (WARN_ON_ONCE(console_irqwork_blocked))
> +		return;
> +
>  	preempt_disable();
>  	/*
>  	 * Guarantee any new records can be seen by tasks preparing to wait
> @@ -4637,6 +4644,21 @@ void defer_console_output(void)
>  	__wake_up_klogd(PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT);
>  }
>
> +void printk_try_flush(void)
> +{
> +	struct console_flush_type ft;
> +
> +	printk_get_console_flush_type(&ft);
> +	if (ft.nbcon_atomic)
> +		nbcon_atomic_flush_pending();
For completeness, we should probably also have here:
	if (ft.nbcon_offload)
		nbcon_kthreads_wake();
> +	if (ft.legacy_direct) {
> +		if (console_trylock())
> +			console_unlock();
> +	}
> +	if (ft.legacy_offload)
> +		defer_console_output();
> +}
> +
>  void printk_trigger_flush(void)
>  {
>  	defer_console_output();
> diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
> index 33c154264bfe..632bbc28cb79 100644
> --- a/lib/nmi_backtrace.c
> +++ b/lib/nmi_backtrace.c
> @@ -78,10 +78,10 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
>  	nmi_backtrace_stall_check(to_cpumask(backtrace_mask));
>
>  	/*
> -	 * Force flush any remote buffers that might be stuck in IRQ context
> +	 * Try flushing messages added by CPUs which might be stuck in IRQ
> +	 * context and therefore could not run their irq_work.
>  	 */
> -	printk_trigger_flush();
> +	printk_try_flush();
>
>  	clear_bit_unlock(0, &backtrace_flag);
>  	put_cpu();
>
> How does it look, please?
I like this. But I think the printk_try_flush() implementation should
simply replace the implementation of printk_trigger_flush().
For the arch/powerpc/kernel/watchdog.c:watchdog_timer_interrupt() and
lib/nmi_backtrace.c:nmi_trigger_cpumask_backtrace() sites I think it is
appropriate.
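So that we are talking about the same thing, the consolidated
printk_trigger_flush() would then look something like this (untested
sketch, with the ft.nbcon_offload wake-up from above folded in):

void printk_trigger_flush(void)
{
	struct console_flush_type ft;

	printk_get_console_flush_type(&ft);

	/* Flush nbcon consoles directly where possible. */
	if (ft.nbcon_atomic)
		nbcon_atomic_flush_pending();

	/* Otherwise let the nbcon kthreads do the printing. */
	if (ft.nbcon_offload)
		nbcon_kthreads_wake();

	/* Flush legacy consoles directly where possible. */
	if (ft.legacy_direct) {
		if (console_trylock())
			console_unlock();
	}

	/* Otherwise defer legacy printing via irq_work. */
	if (ft.legacy_offload)
		defer_console_output();
}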
For the kernel/printk/nbcon.c:nbcon_device_release() site I think the
call should be changed to defer_console_output(), which is all the
ft.legacy_offload case needs and is exactly what printk_trigger_flush()
does today:
diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
index 558ef31779760..73f315fd97a3e 100644
--- a/kernel/printk/nbcon.c
+++ b/kernel/printk/nbcon.c
@@ -1849,7 +1849,7 @@ void nbcon_device_release(struct console *con)
 			if (console_trylock())
 				console_unlock();
 		} else if (ft.legacy_offload) {
-			printk_trigger_flush();
+			defer_console_output();
 		}
 	}
 	console_srcu_read_unlock(cookie);
Can I fold all that into a new patch?
John