linux-kernel - Re: [PATCH printk v2 08/18] printk: nbcon: Stop threads on shutdown/reboot

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <874j9gyexd.fsf@jogness.linutronix.de>
Date: Tue, 25 Jun 2024 22:02:30 +0206
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, Steven Rostedt
 <rostedt@...dmis.org>, Thomas Gleixner <tglx@...utronix.de>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH printk v2 08/18] printk: nbcon: Stop threads on
 shutdown/reboot

On 2024-06-17, Petr Mladek <pmladek@...e.com> wrote:
> First ideas (how the waterfall of questions started):
>
>     + I though that we should do here:
>
> 	    printk_threads_enabled = false;
>
>       It would tell vprintk_emit() to call
>       nbcon_atomic_flush_pending().  And it would be symetric with
>       printk_setup_threads().

This makes sense. I created @printk_threads_enabled to have a clean
startup of the threads. I did not consider recycling it for shutdown
purposes.

>     + Also I though whether to call it before or after stopping the
>       kthreads.
>
>       If we set the variable before stopping kthreads then the
>       kthreads might try to flush the messages in parallel with the
>       flush from vprintk_emit(). It should be safe because it would be
>       serialized via the nbcon console context. But it would be the
>       first scenario where this happen => not much tested.

It is safe. The scenario is not that unusual. Non-printing activities
can contend with thread via the port_lock wrapper quite often.

>       Well, we should start the direct flush before stopping kthreads.
>       So that we could see the messages when the stopping fails.
>
>
> But then I realized:
>
> Ha, vprintk_emit() would call nbcon_atomic_flush_pending() anyway
> because these notifiers are called with:
>
> 	system_state > SYSTEM_RUNNING
>
> This makes me wonder whether we need to stop the kthreads at all.

Well, that overlap was not intentional. There probably should be a flag
to signify the transition rather than just looking at @system_state.

> How exactly does this make printk more robust during shutdown?

Because by stopping them, the printing threads are guaranteed to go
silent before they suddenly freeze. Without the clean shutdown, I could
create shutdown/reboot scenarios where the printing thread would stop
mid-line. Then the atomic printing would never be able to finish because
it is a non-panic situation and the thread was frozen while holding the
context. The result: the user never sees the final printk messages.

> Well, there is one differece. The kthreads could not interferre with
> the direct flush when they are stopped.
>
> But we have the same problem also with suspend/resume. And we do not
> stop the kthreads in this case.

Suspend/resume is something different. Suspend needs to stop _all_
printing. I do not think it makes sense to shutdown threads for
this. The consoles becoming temporarily !usable() is enough.

For shutdown, the consoles are usable() the entire time. I just want the
threads to get out of the way before they freeze so that the user can
see all the messages on shutdown/reboot.

> Maybe, we should just add a check of
>
>        system_state > SYSTEM_RUNNING
>
> also into the nbcon_kthread_func(). Maybe, the kthread should not try
> to flush the messages when they are flushed directly by
> vprintk_emit().

It gets racy when we start relying on the contexts noticing some
variables. By racy, I mean there are scenarios where there are unprinted
records and no context is printing. I think it is easiest when the
following steps occur within a notifier:

1. notifier sets a flag to allow atomic printing from vprintk_emit()

2. notifier stops all threads (with the kthread_should_stop() moved
   inside the printing loop of nbcon_kthread_func())

3. notifier performs nbcon_atomic_flush_pending()

This guarantees that no messages are lost and that all get flushed.

>> +static int __init printk_init_ops(void)
>> +{
>> +	register_syscore_ops(&printk_syscore_ops);
>> +	return 0;
>> +}
>> +device_initcall(printk_init_ops);
>
> I guess that "device_initcall" is cut&pasted ;-) IMHO, it does not
> fit much. That said, I do not have strong opinion where
> it belongs.
>
> IMHO, the best solution would be to register the notifier in
> printk_setup_threads() before we actually allow creating them.

OK.

John Ogness