Message-ID: <aRWqei9jA8gcM-sD@pathway.suse.cz>
Date: Thu, 13 Nov 2025 10:52:58 +0100
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Sherry Sun <sherry.sun@....com>, Jacky Bai <ping.bai@....com>,
	Jon Hunter <jonathanh@...dia.com>,
	Thierry Reding <thierry.reding@...il.com>,
	Derek Barbosa <debarbos@...hat.com>, linux-kernel@...r.kernel.org,
	stable@...r.kernel.org
Subject: Re: [PATCH printk v1 1/1] printk: Avoid scheduling irq_work on
 suspend

On Wed 2025-11-12 16:56:22, John Ogness wrote:
> On 2025-11-11, Petr Mladek <pmladek@...e.com> wrote:
> >> Introduce a new global variable @console_irqwork_blocked to flag
> >> when irq_work queueing is to be avoided. The flag is used by
> >> printk_get_console_flush_type() to avoid allowing deferred printing
> >> and switch NBCON consoles to atomic flushing. It is also used by
> >> vprintk_emit() to avoid klogd waking.
> >> 
> >> --- a/kernel/printk/internal.h
> >> +++ b/kernel/printk/internal.h
> >> @@ -230,6 +230,8 @@ struct console_flush_type {
> >>  	bool	legacy_offload;
> >>  };
> >>  
> >> +extern bool console_irqwork_blocked;
> >> +
> >>  /*
> >>   * Identify which console flushing methods should be used in the context of
> >>   * the caller.
> >> @@ -241,7 +243,7 @@ static inline void printk_get_console_flush_type(struct console_flush_type *ft)
> >>  	switch (nbcon_get_default_prio()) {
> >>  	case NBCON_PRIO_NORMAL:
> >>  		if (have_nbcon_console && !have_boot_console) {
> >> -			if (printk_kthreads_running)
> >> +			if (printk_kthreads_running && !console_irqwork_blocked)
> >>  				ft->nbcon_offload = true;
> >>  			else
> >>  				ft->nbcon_atomic = true;
> >> @@ -251,7 +253,7 @@ static inline void printk_get_console_flush_type(struct console_flush_type *ft)
> >>  		if (have_legacy_console || have_boot_console) {
> >>  			if (!is_printk_legacy_deferred())
> >>  				ft->legacy_direct = true;
> >> -			else
> >> +			else if (!console_irqwork_blocked)
> >>  				ft->legacy_offload = true;
> >>  		}
> >>  		break;
> >
> > This is one possibility.
> >
> > Another possibility would be to block the irq_work
> > directly in defer_console_output() and wake_up_klogd().
> > It would handle all situations, including the
> > printk_trigger_flush() call path.
> >
> > Or is there any reason why these call paths are not handled?
> 
> My intention was to focus only on irq_work triggered directly by
> printk() calls as well as transitioning NBCON from threaded to direct.
> 
> > I do not have a strong opinion. This patch makes it more explicit
> > when defer_console_output() or wake_up_klogd() is called.
> >
> > If we move the check into defer_console_output() or wake_up_klogd(),
> > it would hide the behavior. But it would make the API safer to use.
> > And wake_up_klogd() is even exported via <linux/printk.h>.
> 
> Earlier test versions of this patch did exactly as you are suggesting
> here. But I felt like "properly avoiding" deferred/offloaded printing
> via printk_get_console_flush_type() (rather than just hacking
> irq_work_queue() callers to abort) was a cleaner solution. Especially
> since printk_get_console_flush_type() modifications were needed anyway
> in order to transition NBCON from threaded to direct.

I see.

> >> @@ -264,7 +266,7 @@ static inline void printk_get_console_flush_type(struct console_flush_type *ft)
> >>  		if (have_legacy_console || have_boot_console) {
> >>  			if (!is_printk_legacy_deferred())
> >>  				ft->legacy_direct = true;
> >> -			else
> >> +			else if (!console_irqwork_blocked)
> >>  				ft->legacy_offload = true;
> >
> > This change won't be needed if we move the check into
> > defer_console_output() and wake_up_klogd().
> >
> >>  		}
> >>  		break;
> >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> >> index 5aee9ffb16b9a..94fc4a8662d4b 100644
> >> --- a/kernel/printk/printk.c
> >> +++ b/kernel/printk/printk.c
> >> @@ -2426,7 +2429,7 @@ asmlinkage int vprintk_emit(int facility, int level,
> >>  
> >>  	if (ft.legacy_offload)
> >>  		defer_console_output();
> >> -	else
> >> +	else if (!console_irqwork_blocked)
> >>  		wake_up_klogd();
> >
> > Same here.
> >
> >>  
> >>  	return printed_len;
> 
> I would prefer to keep all the printk_get_console_flush_type() changes
> since it returns proper available flush type information. If you would
> like to _additionally_ short-circuit __wake_up_klogd() and
> nbcon_kthreads_wake() in order to avoid all possible irq_work queueing,
> I would be OK with that.

Combining both approaches might be a bit messy. Some code paths might
work correctly because of the explicit check and some just by chance.

But I have an idea: we could add a warning to __wake_up_klogd()
and nbcon_kthreads_wake() to report when they are called unexpectedly.

And we should also prevent these calls from lib/nmi_backtrace.c.

I played with it and came up with the following changes on
top of this patch:

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 45c663124c9b..71e31b908ad1 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -203,6 +203,7 @@ void dump_stack_print_info(const char *log_lvl);
 void show_regs_print_info(const char *log_lvl);
 extern asmlinkage void dump_stack_lvl(const char *log_lvl) __cold;
 extern asmlinkage void dump_stack(void) __cold;
+void printk_try_flush(void);
 void printk_trigger_flush(void);
 void console_try_replay_all(void);
 void printk_legacy_allow_panic_sync(void);
@@ -299,6 +300,9 @@ static inline void dump_stack_lvl(const char *log_lvl)
 static inline void dump_stack(void)
 {
 }
+static inline void printk_try_flush(void)
+{
+}
 static inline void printk_trigger_flush(void)
 {
 }
diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
index ffd5a2593306..a09b8502e507 100644
--- a/kernel/printk/nbcon.c
+++ b/kernel/printk/nbcon.c
@@ -1302,6 +1302,13 @@ void nbcon_kthreads_wake(void)
 	if (!printk_kthreads_running)
 		return;
 
+	/*
+	 * Nobody is allowed to call this function when console irq_work
+	 * is blocked.
+	 */
+	if (WARN_ON_ONCE(console_irqwork_blocked))
+		return;
+
 	cookie = console_srcu_read_lock();
 	for_each_console_srcu(con) {
 		if (!(console_srcu_read_flags(con) & CON_NBCON))
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 334b4edff08c..e9290c418d12 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -4581,6 +4581,13 @@ static void __wake_up_klogd(int val)
 	if (!printk_percpu_data_ready())
 		return;
 
+	/*
+	 * Nobody is allowed to call this function when console irq_work
+	 * is blocked.
+	 */
+	if (WARN_ON_ONCE(console_irqwork_blocked))
+		return;
+
 	preempt_disable();
 	/*
 	 * Guarantee any new records can be seen by tasks preparing to wait
@@ -4637,6 +4644,21 @@ void defer_console_output(void)
 	__wake_up_klogd(PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT);
 }
 
+void printk_try_flush(void)
+{
+	struct console_flush_type ft;
+
+	printk_get_console_flush_type(&ft);
+	if (ft.nbcon_atomic)
+		nbcon_atomic_flush_pending();
+	if (ft.legacy_direct) {
+		if (console_trylock())
+			console_unlock();
+	}
+	if (ft.legacy_offload)
+		defer_console_output();
+}
+
 void printk_trigger_flush(void)
 {
 	defer_console_output();
diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 33c154264bfe..632bbc28cb79 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -78,10 +78,10 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
 	nmi_backtrace_stall_check(to_cpumask(backtrace_mask));
 
 	/*
-	 * Force flush any remote buffers that might be stuck in IRQ context
+	 * Try flushing messages added by CPUs which might be stuck in IRQ context
 	 * and therefore could not run their irq_work.
 	 */
-	printk_trigger_flush();
+	printk_try_flush();
 
 	clear_bit_unlock(0, &backtrace_flag);
 	put_cpu();

How does it look, please?

Best Regards,
Petr
