[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <6035c35f72eb1ac8817396ca08aae71d097ca42c.camel@suse.com>
Date: Mon, 11 Aug 2025 16:19:34 -0300
From: Marcos Paulo de Souza <mpdesouza@...e.com>
To: Daniel Thompson <daniel@...cstar.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Petr Mladek
<pmladek@...e.com>, Steven Rostedt <rostedt@...dmis.org>, John Ogness
<john.ogness@...utronix.de>, Sergey Senozhatsky <senozhatsky@...omium.org>,
Jason Wessel <jason.wessel@...driver.com>, Daniel Thompson
<danielt@...nel.org>, Douglas Anderson <dianders@...omium.org>,
linux-kernel@...r.kernel.org, kgdb-bugreport@...ts.sourceforge.net
Subject: Re: [PATCH v2 3/3] kdb: Adapt kdb_msg_write to work with NBCON
consoles
On Mon, 2025-08-11 at 15:38 +0100, Daniel Thompson wrote:
> Hi Marcos
>
> No objections, but a couple of questions if I may...
np at all, thanks for looking at the code!
>
>
> On Mon, Aug 11, 2025 at 10:32:47AM -0300, Marcos Paulo de Souza
> wrote:
> > Function kdb_msg_write was calling con->write for any found
> > console,
> > but it won't work on NBCON ones. In this case we should acquire the
> > ownership of the console using NBCON_PRIO_EMERGENCY, since printing
> > kdb messages should only be interrupted by a panic. This is done by
> > the
> > nbcon_kdb_{acquire,release} functions.
>
> Just wanted to check what it means to be "interrupted by a panic"?
>
> kdb is called from the panic handler but, assuming the serial port is
> run
> syncrhonously when "bad things are happening", the serial port should
> be
> quiet when we enter kdb meaning we can still acquire ownership of the
> console?
TBH I haven't thought about that. I talked with Petr Mladek about it,
and we agreed to have something similar to nbcon_device_try_acquire,
but with a higher priority, to make sure that we would get the context
at this point. But, when thinking about it, since KDB runs on the panic
handler, so we I'm not sure even if we need the _enter_unsafe() call at
this point, since nobody is going to interrupt either way.
About the try_acquire part, I haven't checked about the state of the
console devices when the panic happens, if they drop the ownership of
the console on non-panic CPUs or not, so I'm not sure if this is needed
or not. I'll wait for Petr and/or maybe John to add some info.
> >
> > At this point, the console is required to use the atomic callback.
> > The
> > console is skipped if the write_atomic callback is not set or if
> > the
> > context could not be acquired. The validation of NBCON is done by
> > the
> > console_is_usable helper. The context is released right after
> > write_atomic finishes.
> >
> > Suggested-by: Petr Mladek <pmladek@...e.com>
> > Signed-off-by: Marcos Paulo de Souza <mpdesouza@...e.com>
> > ---
> > kernel/debug/kdb/kdb_io.c | 26 ++++++++++++++++++++++----
> > 1 file changed, 22 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > index
> > 9b11b10b120cf07e451a7a4d92ce50f9a6c066b2..74f6d4316bdc9d3c4f6d4252b
> > f425e33cce65a87 100644
> > --- a/kernel/debug/kdb/kdb_io.c
> > +++ b/kernel/debug/kdb/kdb_io.c
> > @@ -589,12 +589,23 @@ static void kdb_msg_write(const char *msg,
> > int msg_len)
> > */
> > cookie = console_srcu_read_lock();
> > for_each_console_srcu(c) {
> > - if (!(console_srcu_read_flags(c) & CON_ENABLED))
> > + struct nbcon_write_context wctxt = { };
> > + short flags = console_srcu_read_flags(c);
> > +
> > + if (!console_is_usable(c, flags, true))
> > continue;
> > if (c == dbg_io_ops->cons)
> > continue;
> > - if (!c->write)
> > - continue;
> > +
> > + /*
> > + * Do not continue if the console is NBCON and the
> > context
> > + * can't be acquired.
> > + */
> > + if (flags & CON_NBCON) {
> > + if (!nbcon_kdb_try_acquire(c, &wctxt))
> > + continue;
> > + }
> > +
> > /*
> > * Set oops_in_progress to encourage the console
> > drivers to
> > * disregard their internal spin locks: in the
> > current calling
> > @@ -605,7 +616,14 @@ static void kdb_msg_write(const char *msg, int
> > msg_len)
> > * for this calling context.
> > */
> > ++oops_in_progress;
>
> Dumb question, but is the oops_in_progress bump still useful with
> atomic
> consoles? Will the console have any spinlocks to disregard or will
> the
> console ownership code already handled any mutual exclusion issues
> meaning
> there should be no spinlocks taking by the atomic write handler?
>
IIUC, since we can have multiple consoles, and some of them are NB and
others aren't, I believe that this oops_in_progress is still useful.
>
>
> Thanks
>
> Daniel.
Powered by blists - more mailing lists