linux-kernel - how to flush consoles: was: Re: [PATCH] printk: nbcon_atomic_flush_pending() is safe only when there is no boot console

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zn7b8g1HtuTIAwyi@pathway.suse.cz>
Date: Fri, 28 Jun 2024 17:51:14 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: how to flush consoles: was: Re: [PATCH] printk:
 nbcon_atomic_flush_pending() is safe only when there is no boot console

On Tue 2024-06-25 22:59:34, John Ogness wrote:
> On 2024-06-13, Petr Mladek <pmladek@...e.com> wrote:
> > Boot consoles are not serialized with the nbcon consoles via the nbcon
> > console context or con->device_lock(). The serialization is possible only
> > via the legacy console_lock().
> >
> > The decision whether nbcon_atomic_flush_pending() should and can be
> > called safely is similar and closely related to the decision
> > whether the legacy loop should be used.
> >
> > Define printing_via_context_safe symmetrically with printing_via_unlock.
> > Allow to call nbcon_atomic_flush_pending() only when it is needed and safe.
> 
> This patch, along with other comments you made about the many different
> checks to see when it is allowed to do what, forced me to take a step
> back and look at the big picture. Indeed, by the end of this series we
> have many boolean variables that influence decisions about how to handle
> threads and how to print.
> 
> I am thinking it makes more sense to incorporate these booleans into a
> single variable (printk_state?). The variable would only be changed
> under the console_list_lock, which could allow lockless users to
> leverage the console_srcu_read_lock for consistency.
> 
> The different orthogonal bits of the variable would be:
> 
> 
> * have_boot_console - true if a boot console is registered
> 
> * have_legacy_console - true if a non-nbcon console is registered
> 
> * have_nbcon_console - true if an nbcon console is registered
> 
> * printk_threads_allowed - true when it is OK to have threads
>   (set/cleared by notifiers, and more or less represents
>   "system_state == SYSTEM_SCHEDULING || system_state == SYSTEM_RUNNING")
> 
> * printk_threads_available - true while all printing threads are running
>   (and implies at least 1 thread is running)

I would call the variable "printk_threads_running". It seems to have
more clear meaning.

12th patch adds:

  * force_printkthread - which triggers offloading even the legacy
	into a thread. Or does it force using kthreads even in
	emergency?


> We could provide macros to interpret this multi-flag value for various
> scenarios:
> 
> #define nbcon_may_create_threads()
>   (printk_threads_allowed)

The force_printkthread involves also legacy consoles. As a result,
the "nbcon_" prefix does not look important.

> #define nbcon_may_rely_on_threads()
>   (have_nbcon_console && !have_boot_console && printk_threads_available)
> 
> #define nbcon_may_flush_atomic()
>   (have_nbcon_console && !have_boot_console)
> 
> #define nbcon_must_flush_atomic()
>   (have_nbcon_console && !have_boot_console && !printk_threads_available)
> 
> #define nbcon_must_flush_via_unlock()
>   (have_nbcon_console && have_boot_console)
> 
> #define printing_via_unlock()
>   (have_legacy_console || have_boot_console)

I like this "may" vs "must" variants.

Good names and good macros might help, definitely. But I think that
there are more problems. There are also several variants of functions
flushing the consoles. Are they are orthogonal as well:

One aspect are the console callbacks:

    + write()		# legacy consoles
    + write_atomic()    # nbcon consoles in atomic or unknown context
    + write_thread      # nbcon consoles in task context

Another aspect is the serialization:

    + console_lock()			  # legacy loop
    + con->driver_lock() + nbcon_context  # nbcon consoles in normal context
    + nbcon_context			  # nbcon consoles in emergency
					  # or panic context
Caller:

    + printk()
    + explicit flush()

Context priority:

    + normal
    + emergency
    + panic


I still have to think if we could somehow improve the situation,
for example, by using some systematic names.

Well, the conditions are more or less straightforward in most
situations with three exceptions:

   + vprintk_emit()
   + nbcon_cpu_emergency_exit()
   + nbcon_cpu_emergency_flush()

, where:

  +  The rules for the direct flush are the same in both
     nbcon_cpu_emergency_exit() and nbcon_cpu_emergency_flush().

     But only nbcon_cpu_emergency_exit() calls the trigger-offload
     part.


  +  The rules in vprintk_emit() are much more complicated
     by the special handling in:

       + NBCON_PRIO_PANIC and legacy_allow_panic_sync
       + NBCON_PRIO_EMERGENCY

IMHO, it would help to:

  + avoid code duplication in nbcon_cpu_emergency_exit()/flush()

  + make the code in vprintk_emit() as compact as possible,
    especially avoid updating "do_trylock_unlock" all over

  + make the code in vprintk_emit as similar to
    nbcon_cpu_emergency_exit()/flush() as possible.


I have tried various variants and it always became complicated.
So I decided to make the logic as compact as possible in the
following two functions:


/**
 * struct console_flush_type - Define how to flush the consoles.
 * @nbcon_atomic: Flush directly using nbcon_atomic() callback
 * @nbcon_thread: Offload the flush to the kthread
 * @legacy_direct: Call the legacy loop in this context
 * @legacy_offload: Offload the legacy loop into IRQ or kthread
 */
struct console_flush_type {
	bool	nbcon_atomic;
	bool	nbcon_thread;
	bool	legacy_direct;
	bool	legacy_offload;
};

/*
 * Decide how the messages should get flushed from printk().
 * It is supposed to be called in vprintk_emit()
 *
 * Variant per console type
 */
void printk_set_console_flush_type(struct console_flush *flush)
{
	enum nbcon_prio nbcon_prio;

	memset(flush, 0, sizeof(*flush));

	/*
	 * nbcon_get_default_prio() can be read safely even in premptible
	 * context. NBCON_PRIO_PANIC is used only on panic-CPU.
	 * NBCON_PRIO_EMERGENCY is set only in context with CPU migragtion
	 * disabled.
	 */
	nbcon_prio = nbcon_get_default_prio();

	/*
	 * Skip it in EMERGENCY priority. The console will be
	 * explicitly flushed when exiting the emergency section.
	 */
	 if (nbcon_prio == NBCON_PRIO_EMERGENCY)
			return;

	/* How to flush nbcon consoles without legacy loop. */
	if (have_nbcon_console && !have_boot_console) {
		if (nbcon_prio == NBCON_PRIO_PANIC) {
			flush->nbcon_atomic = true;

			/* In panic, the legacy consoles are not allowed
			 * to print from the printk calling context unless
			 * explicitly allowed. This gives the safe nbcon
			 * consoles a chance to print out all the panic
			 * messages first. This restriction only applies
			 * if there are nbcon consoles registered.
			 */
			if (!legacy_allow_panic_sync)
				return;

		/* Only NBCON_PRIO_NORMAL left. */
		} else if (nbcon_kthreads_running) {
			flush->nbcon_thread = true;
		} else {
			flush->nbcon_atomic = true;
		}
	}

	/* How to do the legacy loop. */
	if (have_legacy_console || have_boot_console) {
		if (nbcon_prio == NBCON_PRIO_PANIC &&
		    legacy_allow_panic_sync) {
			flush->nbcon_direct = true;

		/* Only NBCON_PRIO_NORMAL left. */
		} if (is_printk_deferred() || console_legacy_thread)) {
			flush->legacy_thread = true;
		} else {
			flush->legacy_direct = true;
		}
	}
}

/*
 * Decide how the messages should get flushed from printk().
 * It is supposed to be called in vprintk_emit()
 *
 * Variant per nbcon prio
 */
void printk_set_console_flush_type(struct console_flush *flush)
{
	enum nbcon_prio nbcon_prio;

	memset(flush, 0, sizeof(*flush));

	/*
	 * nbcon_get_default_prio() can be read safely even in premptible
	 * context. NBCON_PRIO_PANIC is used only on panic-CPU.
	 * NBCON_PRIO_EMERGENCY is set only in context with CPU migragtion
	 * disabled.
	 */
	nbcon_prio = nbcon_get_default_prio();

	switch(nbcon_prio):
	NBCON_PRIO_NORMAL:
		if (have_nbcon_console && !have_boot_console) {
			if (nbcon_kthreads_running)
				flush->nbcon_thread = true;
			else
				flush->nbcon_atomic = true;
		}

		if (have_legacy_console || have_boot_console ) {
			if (is_printk_deferred() || console_legacy_thread))
				flush->legacy_thread = true;
			else
				flush->legacy_direct = true;
		}
		break;

	NBCON_PRIO_EMERGENCY:
		/*
		 * Skip it in EMERGENCY priority. The console will be
		 * explicitly flushed when exiting the emergency section.
		 */
		 break;

	NBCON_PRIO_PANIC:
		if (have_nbcon_console && !have_boot_console) {
			flush->nbcon_atomic = true;

			/* In panic, the legacy consoles are not allowed
			 * to print from the printk calling context unless
			 * explicitly allowed. This gives the safe nbcon
			 * consoles a chance to print out all the panic
			 * messages first. This restriction only applies
			 * if there are nbcon consoles registered.
			 */
			if (!legacy_allow_panic_sync)
				break;
		}

		if ((have_legacy_console || have_boot_console) &&
			flush->nbcon_direct = true;

		break;
	default:
		WARN_ON_ONCE(1, "This should never happen\n");
	}
}

/*
 * Decide how the messages should get flushed from emergency context.
 * This is called when we really want to flush the emergency messages.
 *
 * FIXME: Emergency messages should always get flushed directly, except
 *        when it is not safe.
 */
void get_console_emergency_flush_type(struct console_flush_type *flush)
{
	enum nbcon_prio nbcon_prio = nbcon_get_default_prio();

	memset(flush, 0, sizeof(*flush));

	WARN_ON_ONCE(nbcon_prio != NBCON_PRIO_EMERGENCY);

	if (have_nbcon_console && !have_boot_console)
		flush->nbcon_atomic = true;


	if (have_legacy_console || have_boot_console) {
		/* FIXME: add force_legacy_kthread? */
		if (is_printk_deferred())
			flush->legacy_thread = true;
		} else {
			flush->legacy_direct = true;
		}
	}
}

The first function might be used in vprintk_emit as:

asmlinkage int vprintk_emit(int facility, int level,
			    const struct dev_printk_info *dev_info,
			    const char *fmt, va_list args)
{
	struct console_flush_type flush;
	bool do_trylock_unlock = !force_printkthreads() &&
				 printing_via_unlock;
	int printed_len;

	/* Suppress unimportant messages after panic happens */
	if (unlikely(suppress_printk))
		return 0;

	/*
	 * The messages on the panic CPU are the most important. If
	 * non-panic CPUs are generating any messages, they will be
	 * silently dropped.
	 */
	if (other_cpu_in_panic())
		return 0;

	if (level == LOGLEVEL_SCHED) {
		level = LOGLEVEL_DEFAULT;
		/* If called from the scheduler, we can not call up(). */
		do_trylock_unlock = false;
	}

	printk_delay(level);

	printed_len = vprintk_store(facility, level, dev_info, fmt, args);

	printk_set_console_flush_type(&flush);

	if (flush.nbcon_direct)
		nbcon_atomic_flush_pending();

	if (flush.nbcon_thread)
		nbcon_wake_threads();

	if (flush.legacy_direct) {
		/*
		 * The caller may be holding system-critical or
		 * timing-sensitive locks. Disable preemption during
		 * printing of all remaining records to all consoles so that
		 * this context can return as soon as possible. Hopefully
		 * another printk() caller will take over the printing.
		 *
		 * Also, nbcon_get_default_prio() requires migration disabled.
		 */
		preempt_disable();

		/*
		 * Try to acquire and then immediately release the console
		 * semaphore. The release will print out buffers. With the
		 * spinning variant, this context tries to take over the
		 * printing from another printing context.
		 */
		if (console_trylock_spinning())
			console_unlock();

		preempt_enable();
	}

	if (flush.legacy_thread)
		defer_console_output();
	else
		wake_up_klogd();

	return printed_len;
}
EXPORT_SYMBOL(vprintk_emit);


I do not say that this is the best solution.

My view:

1. I like the 2nd variant of printk_set_console_flush_type() where the
   code is primary split by nbcon_prio.

   And I like that the rather complicated rules are as compact as
   possible and quite clear for each scenario.


2. The situation is much easier in get_console_emergency_flush_type().

   Note that it never sets "nbcon_thread". Maybe, we could flush the
   consoles in this function. And just return whether it is needed
   to offload the legacy loop.


Best Regards,
Petr