linux-kernel - Re: [PATCH v3 4/4] kdb: Adapt kdb_msg

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aL77aq4gZBsn4epT@pathway.suse.cz>
Date: Mon, 8 Sep 2025 17:51:06 +0200
From: Petr Mladek <pmladek@...e.com>
To: Marcos Paulo de Souza <mpdesouza@...e.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	John Ogness <john.ogness@...utronix.de>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Jason Wessel <jason.wessel@...driver.com>,
	Daniel Thompson <danielt@...nel.org>,
	Douglas Anderson <dianders@...omium.org>,
	linux-kernel@...r.kernel.org, kgdb-bugreport@...ts.sourceforge.net
Subject: Re: [PATCH v3 4/4] kdb: Adapt kdb_msg_write to work with NBCON
 consoles

On Tue 2025-09-02 15:33:55, Marcos Paulo de Souza wrote:
> Function kdb_msg_write was calling con->write for any found console,
> but it won't work on NBCON ones. In this case we should acquire the
> ownership of the console using NBCON_PRIO_EMERGENCY, since printing
> kdb messages should only be interrupted by a panic

I would end the paragraph here.

> in the case it was
> triggered by sysrq debug option.

This is not important. Also there are more ways how to trigger
panic(). For example, it might happen by an error in the kdb code.
Or I heard rumors that some HW even had a physical "trigger NMI" button.

> This is done by the
> nbcon_kdb_{acquire,release} functions.

I think that this is more or less obvious.

> At this point, the console is required to use the atomic callback. The
> console is skipped if the write_atomic callback is not set or if the
> context could not be acquired. The validation of NBCON is done by the
> console_is_usable helper. The context is released right after
> write_atomic finishes.
> 
> The oops_in_progress handling is only needed in the legacy consoles,
> so it was moved around the con->write callback.

> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -589,24 +589,40 @@ static void kdb_msg_write(const char *msg, int msg_len)
>  	 */
>  	cookie = console_srcu_read_lock();
>  	for_each_console_srcu(c) {
> -		if (!(console_srcu_read_flags(c) & CON_ENABLED))
> +		struct nbcon_write_context wctxt = { };
> +		short flags = console_srcu_read_flags(c);
> +
> +		if (!console_is_usable(c, flags, true))
>  			continue;
>  		if (c == dbg_io_ops->cons)
>  			continue;
> -		if (!c->write)
> -			continue;
> -		/*
> -		 * Set oops_in_progress to encourage the console drivers to
> -		 * disregard their internal spin locks: in the current calling
> -		 * context the risk of deadlock is a bigger problem than risks
> -		 * due to re-entering the console driver. We operate directly on
> -		 * oops_in_progress rather than using bust_spinlocks() because
> -		 * the calls bust_spinlocks() makes on exit are not appropriate
> -		 * for this calling context.
> -		 */
> -		++oops_in_progress;
> -		c->write(c, msg, msg_len);
> -		--oops_in_progress;
> +
> +		if (flags & CON_NBCON) {
> +			/*
> +			 * Do not continue if the console is NBCON and the context
> +			 * can't be acquired.
> +			 */
> +			if (!nbcon_kdb_try_acquire(c, &wctxt))
> +				continue;
> +
> +			wctxt.outbuf = (char *)msg;
> +			wctxt.len = msg_len;

I double checked whether we initialized all members of the structure
correctly. And I found that we didn't. We should call here:

			nbcon_write_context_set_buf(&wctxt,
						    &pmsg.pbufs->outbuf[0],
						    pmsg.outbuf_len);

Sigh, this is easy to miss. I remember that I was not super happy
about this design. But the original code initialized the structure
on a single place...

> +			c->write_atomic(c, &wctxt);
> +			nbcon_kdb_release(&wctxt);
> +		} else {
> +			/*
> +			 * Set oops_in_progress to encourage the console drivers to
> +			 * disregard their internal spin locks: in the current calling
> +			 * context the risk of deadlock is a bigger problem than risks
> +			 * due to re-entering the console driver. We operate directly on
> +			 * oops_in_progress rather than using bust_spinlocks() because
> +			 * the calls bust_spinlocks() makes on exit are not appropriate
> +			 * for this calling context.
> +			 */
> +			++oops_in_progress;
> +			c->write(c, msg, msg_len);
> +			--oops_in_progress;
> +		}
>  		touch_nmi_watchdog();
>  	}
>  	console_srcu_read_unlock(cookie);

Best Regards,
Petr