[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZnwGOaSLTkIB0csA@pathway.suse.cz>
Date: Wed, 26 Jun 2024 14:14:49 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH printk v2 08/18] printk: nbcon: Stop threads on
shutdown/reboot
On Tue 2024-06-25 22:02:30, John Ogness wrote:
> On 2024-06-17, Petr Mladek <pmladek@...e.com> wrote:
> > This makes me wonder whether we need to stop the kthreads at all.
>
> > How exactly does this make printk more robust during shutdown?
>
> Because by stopping them, the printing threads are guaranteed to go
> silent before they suddenly freeze. Without the clean shutdown, I could
> create shutdown/reboot scenarios where the printing thread would stop
> mid-line. Then the atomic printing would never be able to finish because
> it is a non-panic situation and the thread was frozen while holding the
> context. The result: the user never sees the final printk messages.
I see.
> > Well, there is one differece. The kthreads could not interferre with
> > the direct flush when they are stopped.
> >
> > But we have the same problem also with suspend/resume. And we do not
> > stop the kthreads in this case.
>
> Suspend/resume is something different. Suspend needs to stop _all_
> printing. I do not think it makes sense to shutdown threads for
> this. The consoles becoming temporarily !usable() is enough.
Just to be sure that we are on the same page. I made a closer
look at the suspend code. I see 4 important milestones:
1. suspend_console();
The consoles are intentionally stopped after this point.
It allows to suspend the related hardware, for example,
the serial port.
2. pm_sleep_disable_secondary_cpus();
Other CPUs are stopped at this point. The kthread might still
be theoretically scheduled.
3. local_irq_disable();
IRQs get disabled. Neither the hardware nor the kthread could
work after this point.
4. system_state = SYSTEM_SUSPEND;
The global value is set. vprintk_emit() starts calling
nbcon_atomic_flush_pending().
>From some reasons, I thought that ordering was different and
the "system_state" was set much earlier. This is why I thought
that it would make sense to "stop" the kthreads.
But it seems that, in reality, the nbcon_atomic_flush_pending() in
vprintk_emit() would never do anything during suspend. The consoles
are already suspended (unusable) when the system_state gets set
to SYSTEM_SUSPEND.
It means that we do not need to synchronize the printk kthreads with
vprintk_emit() for suspend. The only important thing is
the pr_flush() in suspend_console().
> For shutdown, the consoles are usable() the entire time. I just want the
> threads to get out of the way before they freeze so that the user can
> see all the messages on shutdown/reboot.
I see what you mean now. The consoles are not suspended => they are
usable all the time.
> > Maybe, we should just add a check of
> >
> > system_state > SYSTEM_RUNNING
> >
> > also into the nbcon_kthread_func(). Maybe, the kthread should not try
> > to flush the messages when they are flushed directly by
> > vprintk_emit().
>
> It gets racy when we start relying on the contexts noticing some
> variables. By racy, I mean there are scenarios where there are unprinted
> records and no context is printing. I think it is easiest when the
> following steps occur within a notifier:
>
> 1. notifier sets a flag to allow atomic printing from vprintk_emit()
>
> 2. notifier stops all threads (with the kthread_should_stop() moved
> inside the printing loop of nbcon_kthread_func())
>
> 3. notifier performs nbcon_atomic_flush_pending()
>
> This guarantees that no messages are lost and that all get flushed.
Makes sense.
Best Regards,
Petr
Powered by blists - more mailing lists