[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aLB1FCc-IJNxjgIy@pathway.suse.cz>
Date: Thu, 28 Aug 2025 17:26:12 +0200
From: Petr Mladek <pmladek@...e.com>
To: Marcos Paulo de Souza <mpdesouza@...e.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Steven Rostedt <rostedt@...dmis.org>,
John Ogness <john.ogness@...utronix.de>,
Sergey Senozhatsky <senozhatsky@...omium.org>,
Jason Wessel <jason.wessel@...driver.com>,
Daniel Thompson <danielt@...nel.org>,
Douglas Anderson <dianders@...omium.org>,
linux-kernel@...r.kernel.org, kgdb-bugreport@...ts.sourceforge.net
Subject: Re: [PATCH v2 2/3] printk: nbcon: Introduce KDB helpers
On Mon 2025-08-11 10:32:46, Marcos Paulo de Souza wrote:
> These helpers will be used when calling console->write_atomic on
> KDB code in the next patch. It's basically the same implementaion
> as nbcon_device_try_acquire, but using NBCON_PORIO_EMERGENCY when
> acquiring the context.
>
> For release, differently from nbcon_device_release, we don't need to
> flush the console, since all CPUs are stopped when KDB is active.
I thought this when we were discussion the code, especially the
comment in
static void kdb_msg_write(const char *msg, int msg_len)
{
[...]
/*
* The console_srcu_read_lock() only provides safe console list
* traversal. The use of the ->write() callback relies on all other
* CPUs being stopped at the moment and console drivers being able to
* handle reentrance when @oops_in_progress is set.
But I see that kdb_msg_write() is called from vkdb_printf() and
there is the following synchronization:
int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
{
[...]
/* Serialize kdb_printf if multiple cpus try to write at once.
* But if any cpu goes recursive in kdb, just print the output,
* even if it is interleaved with any other text.
*/
local_irq_save(flags);
this_cpu = smp_processor_id();
for (;;) {
old_cpu = cmpxchg(&kdb_printf_cpu, -1, this_cpu);
if (old_cpu == -1 || old_cpu == this_cpu)
break;
cpu_relax();
}
It suggests that the code might be used when other CPUs are still
running.
And for example, kgdb_panic(buf) is called in vpanic() before
the other CPUs are stopped.
My opinion:
It might make sense to flush the console after all. But probably
only the particular console, see below.
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1855,3 +1855,29 @@ void nbcon_device_release(struct console *con)
> console_srcu_read_unlock(cookie);
> }
> EXPORT_SYMBOL_GPL(nbcon_device_release);
> +
The function must be called only in the very specific kdb
context, so it would deserve a comment. Inspired by
nbcon_device_try_acquire():
/**
* nbcon_kdb_try_acquire - Try to acquire nbcon console, enter unsafe
* section, and initialized nbcon write context
* @con: The nbcon console to acquire
* @wctxt: The nbcon write context to be used on success
*
* Context: Under console_srcu_read_lock() for emiting a single kdb message
* using the given con->write_atomic() callback. Can be called
* only when the console is usable at the moment.
*
* Return: True if the console was acquired. False otherwise.
*
* kdb emits messages on consoles registered for printk() without
* storing them into the ring buffer. It has to acquire the console
* ownerhip so that is could call con->write_atomic() callback a safe way.
*
* This function acquires the nbcon console using priority NBCON_PRIO_EMERGENCY
* and marks it unsafe for handover/takeover.
*/
> +bool nbcon_kdb_try_acquire(struct console *con,
> + struct nbcon_write_context *wctxt)
> +{
> + struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
> +
> + memset(ctxt, 0, sizeof(*ctxt));
> + ctxt->console = con;
> + ctxt->prio = NBCON_PRIO_EMERGENCY;
> +
> + if (!nbcon_context_try_acquire(ctxt, false))
> + return false;
> +
> + if (!nbcon_context_enter_unsafe(ctxt))
> + return false;
> +
> + return true;
> +}
> +
It deserves a comment as well, see below:
> +void nbcon_kdb_release(struct nbcon_write_context *wctxt)
> +{
> + struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
> +
> + nbcon_context_exit_unsafe(ctxt);
> + nbcon_context_release(ctxt);
I agree with John that the _release() should be called only when
exit_unsafe() succeeded.
Also it might make sense to flush the console. I would do something
like:
/**
* nbcon_kdb_release - Exit unsafe section and release the nbcon console
*
* @wctxt: The nbcon write context intialized by a succesful
* nbcon_kdb_try_acquire()
*
* Context: Under console_srcu_read_lock() for emiting a single kdb message
* using the given con->write_atomic() callback. Can be called
* only when the console is usable at the moment.
*/
void nbcon_kdb_release(struct nbcon_write_context *wctxt)
{
struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
nbcon_context_exit_unsafe(ctxt);
nbcon_context_release(ctxt);
if (!nbcon_context_exit_unsafe(ctxt))
return;
nbcon_context_release(ctxt);
/*
* Flush any new printk() messages added when the console was blocked.
* Only the console used by the given write context was blocked.
* The console was locked only when the write_atomic() callback
* was usable.
*/
__nbcon_atomic_flush_pending_con(ctxt->console, prb_next_reserve_seq(prb), false);
Best Regards,
Petr
Powered by blists - more mailing lists