linux-kernel - Re: [PATCH RFC 2/2] printk: Use console_flush_one

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aMwy7pFM7EExLxaX@pathway.suse.cz>
Date: Thu, 18 Sep 2025 18:27:26 +0200
From: Petr Mladek <pmladek@...e.com>
To: Andrew Murray <amurray@...goodpenguin.co.uk>
Cc: Steven Rostedt <rostedt@...dmis.org>,
	John Ogness <john.ogness@...utronix.de>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 2/2] printk: Use console_flush_one_record for legacy
 printer kthread

On Mon 2025-09-15 13:43:06, Andrew Murray wrote:
> The legacy printer kthread uses console_lock and
> __console_flush_and_unlock to flush records to the console. This
> approach results in the console_lock being held for the entire
> duration of a flush. This can result in large waiting times for
> those waiting for console_lock especially where there is a large
> volume of records or where the console is slow (e.g. serial). This
> contention is observed during boot, as the call to filp_open in
> console_on_rootfs will delay progression to userspace until any
> in-flight flush is completed.

It would be great to add here the boot logs from the cover letter
so that the real life numbers are stored in the git log.

> Let's instead use __console_flush_unlocked which releases and
> reacquires console_lock between records.

It seems that the patch does the right thing.
I would just suggest some refactoring, see below.

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3301,6 +3301,46 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove
>  	return any_usable;
>  }
>  
> +/*
> + * Print out all remaining records to all consoles.
> + *
> + * @next_seq is set to the sequence number after the last available record.
> + * The value is valid only when this function returns true. It means that all
> + * usable consoles are completely flushed.
> + *
> + * @handover will be set to true if a printk waiter has taken over the
> + * console_lock, in which case the caller is no longer holding the
> + * console_lock. Otherwise it is set to false.
> + *
> + * Returns true when there was at least one usable console and all messages
> + * were flushed to all usable consoles. A returned false informs the caller
> + * that everything was not flushed (either there were no usable consoles or
> + * another context has taken over printing or it is a panic situation and this
> + * is not the panic CPU). Regardless the reason, the caller should assume it
> + * is not useful to immediately try again.
> + */
> +static bool console_flush_all_unlocked(u64 *next_seq, bool *handover)
> +{
> +	bool any_usable;
> +	bool any_progress;
> +
> +	*next_seq = 0;
> +	*handover = false;
> +
> +	do {
> +		console_lock();
> +		any_progress = console_flush_one_record(true, next_seq, handover, &any_usable);
> +
> +		if (*handover)
> +			return false;
> +
> +		__console_unlock();
> +
> +	} while (any_progress);
> +
> +	return any_usable;
> +}

This is yet another console_flush() API with a complicated semantic.
(return value, parameters). It is not bad. But it would be nice
if we did not need it ;-)

>  static void __console_flush_and_unlock(void)
>  {
>  	bool do_cond_resched;
> @@ -3346,6 +3386,17 @@ static void __console_flush_and_unlock(void)
>  	} while (prb_read_valid(prb, next_seq, NULL) && console_trylock());
>  }
>  
> +static void __console_flush_unlocked(void)
> +{
> +	bool handover;
> +	bool flushed;
> +	u64 next_seq;
> +
> +	do {
> +		flushed = console_flush_all_unlocked(&next_seq, &handover);
> +	} while (flushed && !handover && prb_read_valid(prb, next_seq, NULL));
> +}

The semantic of this function is not much clear from the name.
IMHO, it would be a puzzle for people who would try to understand
the code in the future.

> +
>  /**
>   * console_unlock - unblock the legacy console subsystem from printing
>   *
> @@ -3676,8 +3727,7 @@ static int legacy_kthread_func(void *unused)
>  		if (kthread_should_stop())
>  			break;
>  
> -		console_lock();
> -		__console_flush_and_unlock();
> +		__console_flush_unlocked();

IMHO, it would be pretty hard to understand what it does by trying to
understand all the layers of the code.

It might be better to open code it. And I would take inspiration
in the nbcon_kthread_func().

I played with the code and came up with:

static int legacy_kthread_func(void *unused)
{
	bool any_progress;

wait_for_event:
	wait_event_interruptible(legacy_wait, legacy_kthread_should_wakeup());

	do {
		bool any_usable;
		bool handover;
		u64 next_seq;

		if (kthread_should_stop())
			return 0;

		console_lock();
		any_progress = console_flush_one_record(true, &next_seq, &handover, &any_usable);
		if (!handover)
			__console_unlock();

		/*
		 * There is no need to check whether there is any usable
		 * console in this context or whether there are pending
		 * messages. It is checked by legacy_kthread_should_wakeup()
		 * anyway. And console_lock() will never succeed again if
		 * there was panic in progress.
		 */
	} while (any_progress);

	goto wait_for_event;
}

What do you think, please?

Best Regards,
Petr