Message-ID: <ZqixOLkuo0IW2qql@pathway.suse.cz>
Date: Tue, 30 Jul 2024 11:24:25 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH printk v3 03/19] printk: nbcon: Add function for printers
to reacquire ownership
On Mon 2024-07-29 10:42:04, John Ogness wrote:
> On 2024-07-26, Petr Mladek <pmladek@...e.com> wrote:
> > On Mon 2024-07-22 19:25:23, John Ogness wrote:
> >> Since ownership can be lost at any time due to handover or
> >> takeover, a printing context _must_ be prepared to back out
> >> immediately and carefully. However, there are scenarios where
> >> the printing context must reacquire ownership in order to
> >> finalize or revert hardware changes.
> >>
> >> One such example is when interrupts are disabled during
> >> printing. No other context will automagically re-enable the
> >> interrupts. For this case, the disabling context _must_
> >> reacquire nbcon ownership so that it can re-enable the
> >> interrupts.
> >
> > I am still not sure how this is going to be used. It is suspicious.
> > If the context lost the ownership then another started flushing
> > higher priority messages.
> >
> > Is it really safe to manipulate the HW at this point?
> > Won't it break the higher priority context?
>
> Why would it break anything? It spins until it normally and safely
> acquires ownership again. The commit message provides a simple example
> of why it is necessary. With a threaded printer, this situation happens
> almost every time a warning occurs.

I see. It makes sense now.

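To make the scenario concrete, here is a rough user-space model of a write callback that disables an interrupt, loses ownership in the middle of printing, and has to get ownership back before it can undo its own hardware change. Everything in it (struct mock_hw, struct mock_wctxt, mock_reacquire_nobuf(), mock_write_atomic(), the lose_after knob) is a hypothetical stand-in and not kernel code. It only assumes, as discussed in this thread, that a reacquire leaves the context without an output buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct mock_hw {
	bool irq_enabled;	/* models the device interrupt enable bit */
	int chars_written;	/* characters pushed to the device so far */
};

struct mock_wctxt {
	const char *outbuf;	/* NULL once ownership was lost and reacquired */
	size_t len;		/* number of characters in outbuf */
	bool owner;		/* does this context own the console right now? */
	int lose_after;		/* simulate a takeover after N chars, -1 = never */
};

/* Models waiting until ownership is regained; the buffer is gone afterwards. */
static void mock_reacquire_nobuf(struct mock_wctxt *wctxt)
{
	wctxt->owner = true;
	wctxt->outbuf = NULL;
	wctxt->len = 0;
}

/* Models a write callback that must revert its own hardware change. */
static void mock_write_atomic(struct mock_hw *hw, struct mock_wctxt *wctxt)
{
	size_t i;

	assert(wctxt->owner);
	hw->irq_enabled = false;	/* hardware change made by this context */

	for (i = 0; i < wctxt->len; i++) {
		if (wctxt->lose_after >= 0 && (int)i == wctxt->lose_after) {
			wctxt->owner = false;	/* simulated takeover */
			break;			/* back out immediately */
		}
		hw->chars_written++;
	}

	/* No other context re-enables the interrupt, so regain ownership first. */
	if (!wctxt->owner)
		mock_reacquire_nobuf(wctxt);

	hw->irq_enabled = true;
}
```

In this model the caller can tell that the record was not fully printed only by looking at outbuf, because the context owns the console again when the callback returns.
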
> >> --- a/kernel/printk/nbcon.c
> >> +++ b/kernel/printk/nbcon.c
> >> @@ -911,6 +948,15 @@ static bool nbcon_emit_next_record(struct nbcon_write_context *wctxt)
> >> return false;
> >> }
> >>
> >> + if (!wctxt->outbuf) {
> >
> > This check works only when con->write_atomic() called
> > nbcon_reacquire_nobuf().
>
> Exactly. That is what it is for.
>
> > At least, we should clear the buffer also in nbcon_enter_unsafe() and
> > nbcon_exit_unsafe() when they realize that they do own the context.
>
> OK.
>
> > Even better would be to add a check whether we still own the context.
> > Something like:
> >
> > bool nbcon_can_proceed(struct nbcon_write_context *wctxt)
> > {
> > 	struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
> > 	struct nbcon_state cur;
> >
> > 	nbcon_state_read(con, &cur);
> >
> > 	return nbcon_context_can_proceed(ctxt, &cur);
> > }
>
> nbcon_can_proceed() is meant to check ownership. And it only makes sense
> to use it within an unsafe section. Otherwise it is racy.

My idea was: "If we still own the context then we have owned it all
the time and con->write_atomic() succeeded."

The race is not important. If we lose the ownership before updating
nbcon_seq then the line will get written again anyway.

> Once a reacquire has occurred, the driver is allowed to proceed. It just
> is not allowed to print (because its buffer is gone).

I see. My idea does not work because the driver is going to reacquire
the ownership. It means that nbcon_can_proceed() would return true
even when con->write_atomic() failed.

But it is not documented anywhere. And what if the driver has a bug
and does not call reacquire? Or what if the driver does not need
to restore anything?

IMHO, nbcon_emit_next_record() should check both:

	if (use_atomic)
		con->write_atomic(con, wctxt);
	else
		con->write_thread(con, wctxt);

	/* Still owns the console? */
	if (!nbcon_can_proceed(wctxt))
		return false;

	if (!wctxt->outbuf) {
		/*
		 * Ownership was lost and reacquired by the driver.
		 * Handle it as if ownership was lost.
		 */
		nbcon_context_release(ctxt);
		return false;
	}

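To show why both checks are needed, the following is a small user-space model of the three possible outcomes of the write callback. It is only a sketch: struct model_ctxt, enum model_driver_behavior, model_write() and model_emit_next_record() are made-up names, the owner flag stands in for what nbcon_can_proceed() would report, and clearing it stands in for nbcon_context_release().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel structures. */
struct model_ctxt {
	bool owner;		/* context currently owns the console */
	const char *outbuf;	/* cleared when the driver reacquired ownership */
	unsigned long seq;	/* models nbcon_seq: next record to print */
};

enum model_driver_behavior {
	MODEL_WRITE_OK,			/* printed, ownership kept all the time */
	MODEL_LOST,			/* ownership lost, driver did not reacquire */
	MODEL_LOST_AND_REACQUIRED,	/* ownership lost, driver reacquired it */
};

/* Models the effect of the driver write callback on the context. */
static void model_write(struct model_ctxt *ctxt, enum model_driver_behavior b)
{
	switch (b) {
	case MODEL_WRITE_OK:
		break;
	case MODEL_LOST:
		ctxt->owner = false;
		break;
	case MODEL_LOST_AND_REACQUIRED:
		ctxt->owner = true;	/* owner again ... */
		ctxt->outbuf = NULL;	/* ... but the buffer is gone */
		break;
	}
}

/* Returns true only when the record was printed and seq may advance. */
static bool model_emit_next_record(struct model_ctxt *ctxt,
				   enum model_driver_behavior b)
{
	assert(ctxt->owner);
	model_write(ctxt, b);

	/* Check 1: still owns the console? Catches a missing reacquire. */
	if (!ctxt->owner)
		return false;

	/* Check 2: lost and reacquired. Handle it as if ownership was lost. */
	if (!ctxt->outbuf) {
		ctxt->owner = false;	/* models releasing the context */
		return false;
	}

	ctxt->seq++;
	return true;
}
```

With only the outbuf check, the MODEL_LOST case would fall through and advance seq although the record was never completely written. With only the ownership check, the MODEL_LOST_AND_REACQUIRED case would do the same.
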
Best Regards,
Petr