[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a90b46a059d06dbb3ac74445d987668c6a227373.camel@suse.com>
Date: Fri, 29 Aug 2025 09:33:02 -0300
From: Marcos Paulo de Souza <mpdesouza@...e.com>
To: Petr Mladek <pmladek@...e.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Steven Rostedt
<rostedt@...dmis.org>, John Ogness <john.ogness@...utronix.de>, Sergey
Senozhatsky <senozhatsky@...omium.org>, Jason Wessel
<jason.wessel@...driver.com>, Daniel Thompson <danielt@...nel.org>,
Douglas Anderson <dianders@...omium.org>, linux-kernel@...r.kernel.org,
kgdb-bugreport@...ts.sourceforge.net
Subject: Re: [PATCH v2 2/3] printk: nbcon: Introduce KDB helpers
On Thu, 2025-08-28 at 17:26 +0200, Petr Mladek wrote:
> On Mon 2025-08-11 10:32:46, Marcos Paulo de Souza wrote:
> > These helpers will be used when calling console->write_atomic on
> > KDB code in the next patch. It's basically the same implementaion
> > as nbcon_device_try_acquire, but using NBCON_PORIO_EMERGENCY when
> > acquiring the context.
> >
> > For release, differently from nbcon_device_release, we don't need
> > to
> > flush the console, since all CPUs are stopped when KDB is active.
>
> I thought this when we were discussion the code, especially the
> comment in
>
> static void kdb_msg_write(const char *msg, int msg_len)
> {
> [...]
>
> /*
> * The console_srcu_read_lock() only provides safe console
> list
> * traversal. The use of the ->write() callback relies on
> all other
> * CPUs being stopped at the moment and console drivers
> being able to
> * handle reentrance when @oops_in_progress is set.
>
>
> But I see that kdb_msg_write() is called from vkdb_printf() and
> there is the following synchronization:
>
> int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> {
> [...]
>
> /* Serialize kdb_printf if multiple cpus try to write at
> once.
> * But if any cpu goes recursive in kdb, just print the
> output,
> * even if it is interleaved with any other text.
> */
> local_irq_save(flags);
> this_cpu = smp_processor_id();
> for (;;) {
> old_cpu = cmpxchg(&kdb_printf_cpu, -1, this_cpu);
> if (old_cpu == -1 || old_cpu == this_cpu)
> break;
>
> cpu_relax();
> }
>
> It suggests that the code might be used when other CPUs are still
> running.
>
> And for example, kgdb_panic(buf) is called in vpanic() before
> the other CPUs are stopped.
>
>
> My opinion:
>
> It might make sense to flush the console after all. But probably
> only the particular console, see below.
Thanks for being so meticulous and double checking the solution.
>
> > --- a/kernel/printk/nbcon.c
> > +++ b/kernel/printk/nbcon.c
> > @@ -1855,3 +1855,29 @@ void nbcon_device_release(struct console
> > *con)
> > console_srcu_read_unlock(cookie);
> > }
> > EXPORT_SYMBOL_GPL(nbcon_device_release);
> > +
>
> The function must be called only in the very specific kdb
> context, so it would deserve a comment. Inspired by
> nbcon_device_try_acquire():
Yeah... I planned to add documentation on the new functions, but
somehow I forgot until now. I'll take you suggestions and add to the
functions, thanks a lot!
>
> /**
> * nbcon_kdb_try_acquire - Try to acquire nbcon console, enter unsafe
> * section, and initialized nbcon write
> context
> * @con: The nbcon console to acquire
> * @wctxt: The nbcon write context to be used on success
> *
> * Context: Under console_srcu_read_lock() for emiting a single
> kdb message
> * using the given con->write_atomic() callback. Can be
> called
> * only when the console is usable at the moment.
> *
> * Return: True if the console was acquired. False otherwise.
> *
> * kdb emits messages on consoles registered for printk() without
> * storing them into the ring buffer. It has to acquire the console
> * ownerhip so that is could call con->write_atomic() callback a safe
> way.
> *
> * This function acquires the nbcon console using priority
> NBCON_PRIO_EMERGENCY
> * and marks it unsafe for handover/takeover.
> */
>
> > +bool nbcon_kdb_try_acquire(struct console *con,
> > + struct nbcon_write_context *wctxt)
> > +{
> > + struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
> > +
> > + memset(ctxt, 0, sizeof(*ctxt));
> > + ctxt->console = con;
> > + ctxt->prio = NBCON_PRIO_EMERGENCY;
> > +
> > + if (!nbcon_context_try_acquire(ctxt, false))
> > + return false;
> > +
> > + if (!nbcon_context_enter_unsafe(ctxt))
> > + return false;
> > +
> > + return true;
> > +}
> > +
>
> It deserves a comment as well, see below:
>
> > +void nbcon_kdb_release(struct nbcon_write_context *wctxt)
> > +{
> > + struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
> > +
> > + nbcon_context_exit_unsafe(ctxt);
> > + nbcon_context_release(ctxt);
>
> I agree with John that the _release() should be called only when
> exit_unsafe() succeeded.
Done.
>
> Also it might make sense to flush the console. I would do something
> like:
>
>
> /**
> * nbcon_kdb_release - Exit unsafe section and release the nbcon
> console
> *
> * @wctxt: The nbcon write context intialized by a succesful
> * nbcon_kdb_try_acquire()
> *
> * Context: Under console_srcu_read_lock() for emiting a single
> kdb message
> * using the given con->write_atomic() callback. Can be
> called
> * only when the console is usable at the moment.
> */
> void nbcon_kdb_release(struct nbcon_write_context *wctxt)
> {
> struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
>
> nbcon_context_exit_unsafe(ctxt);
> nbcon_context_release(ctxt);
>
> if (!nbcon_context_exit_unsafe(ctxt))
> return;
>
> nbcon_context_release(ctxt);
>
> /*
> * Flush any new printk() messages added when the console
> was blocked.
> * Only the console used by the given write context
> was blocked.
> * The console was locked only when the write_atomic()
> callback
> * was usable.
> */
> __nbcon_atomic_flush_pending_con(ctxt->console,
> prb_next_reserve_seq(prb), false);
Fixed locally, thanks a lot!
>
>
> Best Regards,
> Petr
Powered by blists - more mailing lists