lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 11 Jun 2024 16:51:00 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH printk v2 04/18] printk: nbcon: Introduce printing
 kthreads

On Mon 2024-06-10 14:15:10, John Ogness wrote:
> On 2024-06-07, Petr Mladek <pmladek@...e.com> wrote:
> > I updated this commit message in parallel with commenting the related
> > code changes. My later comment might explain the motivation for
> > writing the commit message this way.
> 
> I am OK with your proposed commit message. Thank you for taking the time
> to fill in all the details.

We should probably update it a bit so that it describes the role
of the device_lock() more precisely. See the discussion below.
> > <proposal>
> 
> [...]
> 
> > The write context in the kthread will stay safe only when either of
> > the following conditions are true:
> >
> >   1. The kthread is the only context which acquires the console
> >      with NBCON_PRIO_NORMAL.
> >
> >   2. Other caller acquires the console context with NBCON_PRIO_NORMAL
> >      under the device_lock.
> >
> >   3. Other caller acquires the console context with NBCON_PRIO_NORMAL
> >      with disabled preemption. It will release the context before
> >      rescheduling.
> >
> > It is even double guaranteed. First, __nbcon_atomic_flush_pending_con()
> > is called:
> >
> >   + with disabled interrupts from nbcon_atomic_flush_pending_con()
> >   + under device_lock() from nbcon_device_release()
> 
> [...]
> 
> > </proposal>
> >
> > BTW: It really looks like the safety is double guaranteed. Maybe
> >      the con->device_lock() is not needed in nbcon_kthread_func()
> >      after all. Well, I would keep it to be on the safe side.
> 
> For the threaded case it is technically correct to say the safety is
> double guaranteed. But the outer safety (device_lock()) can provide a
> preemptible lock, which is the key for the non-interference property of
> the threaded printers.
> 
> For example, for an nbcon framebuffer console, device_lock() most likely
> will be a mutex.
> 
> During normal operation, the inner safety (console context) will never
> be contended. That really only exists to synchronize the atomic case.

I see. For example, the spin_lock_irqsave() in __uart_port_lock_irqsave()
will be sleeping lock in RT => preserving RT guarantees. And it will
make sure that the inner nbcon_context_try_acquire() would never
have to spin => never break the RT guarantees.

It makes perfect sense. Do I get it correctly?

> >> --- a/include/linux/console.h
> >> +++ b/include/linux/console.h
> >> @@ -373,6 +376,27 @@ struct console {
> >>  	 */
> >>  	void (*write_atomic)(struct console *con, struct nbcon_write_context *wctxt);
> >>  
> >> +	/**
> >> +	 * @write_thread:
> >> +	 *
> >> +	 * NBCON callback to write out text in task context.
> >> +	 *
> >> +	 * This callback is called after device_lock() and with the nbcon
> >> +	 * console acquired. Any necessary driver synchronization should have
> >> +	 * been performed by the device_lock() callback.
> >> +	 *
> >> +	 * This callback is always called from task context but with migration
> >> +	 * disabled.
> >> +	 *
> >> +	 * The same criteria for console ownership verification and unsafe
> >> +	 * sections applies as with write_atomic(). The difference between
> >> +	 * this callback and write_atomic() is that this callback is used
> >> +	 * during normal operation and is always called from task context.
> >> +	 * This allows drivers to operate in their own locking context for
> >> +	 * synchronizing output to the hardware.
> >> +	 */
> >
> > The description is not bad. It seems to include all the important
> > information. Except than I still needed to scratch my head around it
> > to understand the real purpose and rules.
> >
> > Namely, the following sentences are kind of vague and bring questions
> > which I had in the past before I understood all the relations:
> >
> >    1. "Any necessary driver synchronization should have been performed
> >        by the device_lock() callback."
> >
> >        Q: Why do we need to take both device_lock() and the context then?
> 
> The device_lock() enables the non-interference property during normal
> operation. OTOH the context is taken to synchronize against non-normal
> (emergency/panic) operation.
> 
> >        Q: Why the acquired context is not enough?
> 
> Because the context _requires_ disabling preemption during all driver
> activity, which is a major source of interference to the system.
> 
> If you only care about UART drivers and !PREEMPT_RT, then there really
> is not much difference between spinning on a spinlock vs. spinning on a
> context. But for non-UART drivers or PREEMPT_RT, there is a huge
> difference between blocking on device_lock() with preemption enabled vs.
> spinning on a context.

This helped a lot. My brain is not trained to see these RT-specific
effects.


> I think it really helps to see the importance if you assume
> device_lock() is a mutex. (All the printk code must also make this
> assumption.)

>From my POV, the game changer is the RT aspect. I knew that
for graphics console drivers the mutex was a nice-to-have thing.
I did not realize that for RT it was a must-to-have thing.

I mean, I did not realize that nbcon_console_try_acquire() was
not RT friendly. We could also say that nbcon_console_try_acquire()
is not acceptable for RT as the primary synchronization primitive.

> > I wonder if we could be more strigthforward.
> >
> > <my-take>
> > 	/**
> > 	 * @write_thread:
> > 	 *
> > 	 * NBCON callback to write out text in task context.
> > 	 *
> > 	 * This callback must be called only in task context with both
> > 	 * device_lock() and the nbcon console acquired.
> > 	 *
> > 	 * The same rules for console ownership verification and unsafe
> > 	 * sections handling applies as with write_atomic().
> > 	 *
> > 	 * The console ownership handling is necessary for synchronization
> > 	 * against write_atomic() which is synchronized only via the context.
> > 	 *
> > 	 * The device_lock() serializes acquiring the console context
> > 	 * with NBCON_PRIO_NORMAL. It is necessary when the device_lock()
> > 	 * does not disable preemption. The console context locking is
> > 	 * not able to detect a race in the following scenario:
> > 	 *
> > 	 *    1. [Task A] owns a context with NBCON_PRIO_NORMAL on [CPU X]
> > 	 *	 and is scheduled.
> > 	 *
> > 	 *    2. Another context takes over the lock with NBCON_PRIO_EMERGENCY
> > 	 *	 and releases it.
> > 	 *
> > 	 *    3. [Task B] acquires a context with NBCON_PRIO_NORMAL on [CPU X]
> > 	 *	 and is scheduled.
> > 	 *
> > 	 *    4. [Task A] gets running on [CPU X] and see that the console is
> > 	 *	 is still owned by a task on [CPU X] with NBON_PRIO_NORMAL.
> > 	 *	 It can't detect that it is actually owned by another task.
> > 	 */
> > </my-take>
> >
> > My variant describes the purpose of device_lock() quite different way.
> > But I think that this is the real purpose and we must be clear about
> > it.
> 
> I would move that last part (starting with "The device_lock()
> serializes...") into the kerneldoc for device_lock(). It really has
> nothing to do with the write_thread() callback.

I agree that describing the details of the race is not important here.

Well, I would still like to describe the role of device_lock()
for write_kthread(). It would help even me to create a better
mental model ;-)

What about the following?

<proposal-2>
	/**
	 * @write_thread:
	 *
	 * NBCON callback to write out text in task context.
	 *
	 * This callback must be called only in task context with both
	 * device_lock() and the nbcon console acquired with
	 * NBCON_PRIO_NORMAL.
	 *
	 * The same rules for console ownership verification and unsafe
	 * sections handling applies as with write_atomic().
	 *
	 * The console ownership handling is necessary for synchronization
	 * against write_atomic() which is synchronized only via the context.
	 *
	 * The device_lock() provides the primary serialization for operations
	 * on the device. It might be as relaxed (mutex)[*] or as tight
	 * (disabled preemption and interrupts) as needed. It allows
	 * the kthread to operate in the least restrictive mode[**].
	 *
	 * [*] Standalone nbcon_context_try_acquire() is not safe with
	 *     the preemption enabled, see nbcon_owner_matches(). But it
	 *     can be safe when always called in the preemptive context
	 *     under the device_lock().
	 *
	 * [**] The device_lock() makes sure that nbcon_context_try_acquire()
	 *      would never need to spin which is important especially with
	 *      PREEMPT_RT.
	 */
</proposal-2>


> > Sigh, I was not able to describe the race reasonably a shorter
> > way. Maybe, we should move this detailed explanation above,
> > nbcon_context_try_acquire() and use just a reference here.
> 
> How about putting it as kerneldoc for nbcon_owner_matches() and refer to
> it there from the device_lock() kerneldoc (or vice versa)?

Yup. nbcon_owner_matches() looks like the right place for details
about the possible race.

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3995,6 +3995,8 @@ static int unregister_console_locked(struct console *console)
>  	if (res > 0)
>  		return 0;
>  
> +	__pr_flush(console, 1000, true);
> +
>  	/* Disable it unconditionally */
>  	console_srcu_write_flags(console, console->flags & ~CON_ENABLED);

Makes sense.

And it actually means that it is too late to flush messages
when kthread_should_stop() returns true. So, it does not matter if
we check it in the while(backlog) loop or not. Well, it might be
worth a comment in the code.

> >> +	kt = kthread_run(nbcon_kthread_func, con, "pr/%s%d", con->name, con->index);
> >> +	if (IS_ERR(kt)) {
> >> +		con_printk(KERN_ERR, con, "failed to start printing thread\n");
> >> +		return;
> >
> > I am a bit surprised that we ignore the error here.
> >
> >
> > Hmm, it likely works in this patch because the legacy code would still
> > flush the console when con->thread is not assigned.
> >
> > But it would stop working later when the legacy loop is disabled
> > by the global @printk_threads_enabled variable.
> 
> The thread failing to start is a serious issue. Particularly for
> PREEMPT_RT.

I agree.

> Probably it should be something like:
> 
> 	if (WARN_ON(IS_ERR(kt))) {

Might make sense.

> In fact, I nbcon_atomic_flush_pending_con() and nbcon_device_release()
> need to check @printk_threads_enabled _instead_ of con->kthread. Once
> "threading mode" has been activated, there should be _no_ atomic
> printing during normal operation.

OK, we have two possibilities when a kthread could not get started:

1. printk() could fallback to flushing messages via write_atomic()
   in the legacy console_unlock() loop.

   It might break RT guarantees but people would see the messages.


2. The affected console would flush the messages only in emergency
   or panic mode.

   This would preserve RT guarantees. And printk() would still be
   able to show emergency and panic messages, including the WARN_ON()
   about that the printk kthread did not start.


It seems that you prefer the 2nd variant. But you are RT-biased ;-)
I would personally prefer the 1st variant. But I am printk-biased ;-)

Honestly, if the system is not able to start the kthread then
it is probably useless anyway. I would prefer if printk keeps working
so that people know what is going on ;-)

> > The kthread is running and started processing messages at this moment.
> > But con->kthread is not set yet.
> 
> [...]
> 
> > It might be worth a comment.
> >
> > <proposal>
> > 	/*
> > 	 * Some users check con->kthread to decide whether to flush
> > 	 * the messages directly using con->write_atomic(). Note that
> > 	 * possible races with the kthread are double prevented.
> > 	 *
> > 	 * First, the users ignore this console until it is added into
> > 	 * @console_list which is done under con->device_lock().
> > 	 * Second, the calls would be additionaly serialized by acquiring
> > 	 * the console context.
> > 	 */
> > </proposal>
> 
> It is enough to mention only the calling context. How about:
> 
> 	/*
> 	 * Some users check con->kthread to decide whether to flush the
> 	 * messages directly using con->write_atomic(). A possible race
> 	 * with the kthread is prevented because all printing is
> 	 * serialized by acquiring the console context.
> 	 */

I am not sure what is more important. In fact, all these users
check con->kthread only when the console is on @console_list.
So, maybe the note about #console_list is more useful.

BTW: uart_port_lock() allows access to the device and it acquires
the context only when the console appears on @console_list.
Well, the access is serialized by the device_lock.

> >> +	con->kthread = kt;

After all, I would add two comments, like these:

<proposal-2>
	/*
	 * Any access to the console device is serialized either by
	 * device_lock() or console context or both.
	 */
	kt = kthread_run(nbcon_kthread_func, con, "pr/%s%d", con->name, con->index);
[...]

	/*
	 * Some users check con->kthread to decide whether to flush
	 * the messages directly using con->write_atomic(). But they
	 * do so only when the console is already in @console_list.
	 */
	 con->kthread = kt;
</proposal-2>

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ