linux-kernel - preffer_ofload param: was: Re: [PATCH printk v3 11/19] printk: nbcon: Rely on kthreads for normal operation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZqpAJgKeB0cIlTg7@pathway.suse.cz>
Date: Wed, 31 Jul 2024 15:46:14 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: preffer_ofload param: was: Re: [PATCH printk v3 11/19] printk:
 nbcon: Rely on kthreads for normal operation

On Mon 2024-07-22 19:25:31, John Ogness wrote:
> Once the kthread is running and available
> (i.e. @printk_kthreads_running is set), the kthread becomes
> responsible for flushing any pending messages which are added
> in NBCON_PRIO_NORMAL context. Namely the legacy
> console_flush_all() and device_release() no longer flush the
> console. And nbcon_atomic_flush_pending() used by
> nbcon_cpu_emergency_exit() no longer flushes messages added
> after the emergency messages.
> 
> --- a/kernel/printk/internal.h
> +++ b/kernel/printk/internal.h
> @@ -190,11 +190,13 @@ extern bool legacy_allow_panic_sync;
>  /**
>   * struct console_flush_type - Define how to flush the consoles
>   * @nbcon_atomic:	Flush directly using nbcon_atomic() callback
> + * @nbcon_offload:	Offload flush to printer thread
>   * @legacy_direct:	Call the legacy loop in this context
>   * @legacy_offload:	Offload the legacy loop into IRQ
>   */
>  struct console_flush_type {
>  	bool	nbcon_atomic;
> +	bool	nbcon_offload;
>  	bool	legacy_direct;
>  	bool	legacy_offload;
>  };
> @@ -220,7 +222,9 @@ static inline void printk_get_console_flush_type(struct console_flush_type *ft,
>  				ft->legacy_direct = true;
>  		}
>  
> -		if (have_nbcon_console && !have_boot_console)
> +		if (printk_kthreads_running)
> +			ft->nbcon_offload = true;
> +		else if (have_nbcon_console && !have_boot_console)
>  			ft->nbcon_atomic = true;
>  		break;
>  
> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> index 233ab8f90fef..8cf9e9e8c6e4 100644
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1511,10 +1511,10 @@ static void nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
>  
>  	/*
>  	 * If flushing was successful but more records are available, this
> -	 * context must flush those remaining records because there is no
> -	 * other context that will do it.
> +	 * context must flush those remaining records if the printer thread
> +	 * is not available do it.
>  	 */
> -	printk_get_console_flush_type(&ft, false);
> +	printk_get_console_flush_type(&ft, true);

Hmm, it is a bit weird that we change the value even though it does
not affect the behavior. The parameter @prefer_offload affects only
the legacy loop.

It is even more confusing because ft.legacy_offload and
ft.nbcon_offload have difference motivation:

  + legacy_offload is used when the direct flush is not allowed/possible
  + nbcon_offload is used when allowed/possible

I am not 100% sure why I added the @prefer_offload parameter. I think
that it was for situations when we wanted to call the legacy loop
independently on whether direct or offload was chosen. I think
that it was for __wake_up_klogd() called from
defer_console_output().

Anyway, this is cscope output in emacs after applying this patchset:

<paste>
Finding symbol: printk_get_console_flush_type

Database directory: /prace/kernel/linux-printk/
-------------------------------------------------------------------------------
*** kernel/printk/internal.h:
printk_get_console_flush_type[225] static inline void printk_get_console_flush_type(struct console_flush_type *ft, bool prefer_offload)

*** kernel/printk/nbcon.c:
nbcon_atomic_flush_pending_con[1548] printk_get_console_flush_type(&ft, true);
nbcon_device_release[1848]     printk_get_console_flush_type(&ft, true);

*** kernel/printk/printk.c:
printk_legacy_allow_panic_sync[2343] printk_get_console_flush_type(&ft, false);
vprintk_emit[2381]             printk_get_console_flush_type(&ft, false);
resume_console[2706]           printk_get_console_flush_type(&ft, true);
console_cpu_notify[2729]       printk_get_console_flush_type(&ft, false);
console_flush_all[3102]        printk_get_console_flush_type(&ft, true);
console_unlock[3223]           printk_get_console_flush_type(&ft, false);
console_flush_on_panic[3375]   printk_get_console_flush_type(&ft, false);
console_start[3453]            printk_get_console_flush_type(&ft, true);
legacy_kthread_should_wakeup[3484] printk_get_console_flush_type(&ft, true);
__pr_flush[4290]               printk_get_console_flush_type(&ft, false);
__pr_flush[4302]               printk_get_console_flush_type(&ft, false);
__wake_up_klogd[4466]          printk_get_console_flush_type(&ft, true);
console_try_replay_all[4851]   printk_get_console_flush_type(&ft, true);
-------------------------------------------------------------------------------
</paste>

Now, the parameter @prefer_offload makes a difference only
when it is set to "true" and "ft.legacy_offload" value is
later used. It reduces the above list to:

*** kernel/printk/printk.c:
resume_console[2706]           printk_get_console_flush_type(&ft, true);
console_start[3453]            printk_get_console_flush_type(&ft, true);
__wake_up_klogd[4466]          printk_get_console_flush_type(&ft, true);
console_try_replay_all[4851]   printk_get_console_flush_type(&ft, true);

IMHO, __wake_up_klogd() is the only location where we really need
to know if there are any messages for the legacy loop, for example,
when called from printk_deferred().

It should not be needed in other situations because it there
is always __pr_flush() or console_unlock() which would flush
the legacy consoles directly anyway.

=> I suggest to

1. Remove @prefer_offload parameter from printk_get_console_flush_type

2. Update __wake_up_klogd() to check both ft.legacy_offload and
   ft.legacy_direct, like:

	printk_get_console_flush_type(&ft);
	if (!ft.legacy_offload && !ft.legacy_direct)
		val &= ~PRINTK_PENDING_OUTPUT;


NOTE: I actually suggested to use in vprintk_emit():

	printk_get_console_flush_type(&ft, deffer_legacy);

      But it can be done even without this parameter. Like it
      is done in this version of the patchset.

Best Regards,
Petr