lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aL77aq4gZBsn4epT@pathway.suse.cz>
Date: Mon, 8 Sep 2025 17:51:06 +0200
From: Petr Mladek <pmladek@...e.com>
To: Marcos Paulo de Souza <mpdesouza@...e.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	John Ogness <john.ogness@...utronix.de>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Jason Wessel <jason.wessel@...driver.com>,
	Daniel Thompson <danielt@...nel.org>,
	Douglas Anderson <dianders@...omium.org>,
	linux-kernel@...r.kernel.org, kgdb-bugreport@...ts.sourceforge.net
Subject: Re: [PATCH v3 4/4] kdb: Adapt kdb_msg_write to work with NBCON
 consoles

On Tue 2025-09-02 15:33:55, Marcos Paulo de Souza wrote:
> Function kdb_msg_write was calling con->write for any found console,
> but it won't work on NBCON ones. In this case we should acquire the
> ownership of the console using NBCON_PRIO_EMERGENCY, since printing
> kdb messages should only be interrupted by a panic

I would end the paragraph here.

> in the case it was
> triggered by sysrq debug option.

This is not important. Also there are more ways how to trigger
panic(). For example, it might happen by an error in the kdb code.
Or I heard rumors that some HW even had a physical "trigger NMI" button.

> This is done by the
> nbcon_kdb_{acquire,release} functions.

I think that this is more or less obvious.

> At this point, the console is required to use the atomic callback. The
> console is skipped if the write_atomic callback is not set or if the
> context could not be acquired. The validation of NBCON is done by the
> console_is_usable helper. The context is released right after
> write_atomic finishes.
> 
> The oops_in_progress handling is only needed in the legacy consoles,
> so it was moved around the con->write callback.

> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -589,24 +589,40 @@ static void kdb_msg_write(const char *msg, int msg_len)
>  	 */
>  	cookie = console_srcu_read_lock();
>  	for_each_console_srcu(c) {
> -		if (!(console_srcu_read_flags(c) & CON_ENABLED))
> +		struct nbcon_write_context wctxt = { };
> +		short flags = console_srcu_read_flags(c);
> +
> +		if (!console_is_usable(c, flags, true))
>  			continue;
>  		if (c == dbg_io_ops->cons)
>  			continue;
> -		if (!c->write)
> -			continue;
> -		/*
> -		 * Set oops_in_progress to encourage the console drivers to
> -		 * disregard their internal spin locks: in the current calling
> -		 * context the risk of deadlock is a bigger problem than risks
> -		 * due to re-entering the console driver. We operate directly on
> -		 * oops_in_progress rather than using bust_spinlocks() because
> -		 * the calls bust_spinlocks() makes on exit are not appropriate
> -		 * for this calling context.
> -		 */
> -		++oops_in_progress;
> -		c->write(c, msg, msg_len);
> -		--oops_in_progress;
> +
> +		if (flags & CON_NBCON) {
> +			/*
> +			 * Do not continue if the console is NBCON and the context
> +			 * can't be acquired.
> +			 */
> +			if (!nbcon_kdb_try_acquire(c, &wctxt))
> +				continue;
> +
> +			wctxt.outbuf = (char *)msg;
> +			wctxt.len = msg_len;

I double checked whether we initialized all members of the structure
correctly. And I found that we didn't. We should call here:

			nbcon_write_context_set_buf(&wctxt,
						    &pmsg.pbufs->outbuf[0],
						    pmsg.outbuf_len);

Sigh, this is easy to miss. I remember that I was not super happy
about this design. But the original code initialized the structure
on a single place...

> +			c->write_atomic(c, &wctxt);
> +			nbcon_kdb_release(&wctxt);
> +		} else {
> +			/*
> +			 * Set oops_in_progress to encourage the console drivers to
> +			 * disregard their internal spin locks: in the current calling
> +			 * context the risk of deadlock is a bigger problem than risks
> +			 * due to re-entering the console driver. We operate directly on
> +			 * oops_in_progress rather than using bust_spinlocks() because
> +			 * the calls bust_spinlocks() makes on exit are not appropriate
> +			 * for this calling context.
> +			 */
> +			++oops_in_progress;
> +			c->write(c, msg, msg_len);
> +			--oops_in_progress;
> +		}
>  		touch_nmi_watchdog();
>  	}
>  	console_srcu_read_unlock(cookie);

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ