linux-kernel - Re: [PATCH printk v2 24/26] panic: Mark emergency section in oops


Message-ID: <ZeHsaU4CbwJSEOtG@alley>
Date: Fri, 1 Mar 2024 15:55:37 +0100
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Peter Zijlstra (Intel)" <peterz@...radead.org>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	"Guilherme G. Piccoli" <gpiccoli@...lia.com>,
	Arnd Bergmann <arnd@...db.de>,
	Kefeng Wang <wangkefeng.wang@...wei.com>,
	Uros Bizjak <ubizjak@...il.com>
Subject: Re: [PATCH printk v2 24/26] panic: Mark emergency section in oops

On Sun 2024-02-18 20:03:24, John Ogness wrote:
> Mark an emergency section beginning with oops_enter() until the
> end of oops_exit(). In this section, the CPU will not perform
> console output for the printk() calls. Instead, a flushing of the
> console output is triggered when exiting the emergency section.
> 
> The very end of oops_exit() performs a kmsg_dump(). This is not
> included in the emergency section because it is another
> flushing mechanism that should occur after the consoles have
> been triggered to flush.
> 
> Signed-off-by: John Ogness <john.ogness@...utronix.de>
> ---
>  kernel/panic.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/panic.c b/kernel/panic.c
> index d30d261f9246..9fa44bc38f46 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -634,6 +634,7 @@ bool oops_may_print(void)
>   */
>  void oops_enter(void)
>  {
> +	nbcon_cpu_emergency_enter();
>  	tracing_off();
>  	/* can't trust the integrity of the kernel anymore: */
>  	debug_locks_off();
> @@ -656,6 +657,7 @@ void oops_exit(void)
>  {
>  	do_oops_enter_exit();

The comment above the oops_enter() function says:

/*
 * Called when the architecture enters its oops handler, before it prints
 * anything.  If this is the first CPU to oops, and it's oopsing the first
 * time then let it proceed.
 *
 * This is all enabled by the pause_on_oops kernel boot option.  We do all
 * this to ensure that oopses don't scroll off the screen.  It has the
 * side-effect of preventing later-oopsing CPUs from mucking up the display,
 * too.
 *
 * It turns out that the CPU which is allowed to print ends up pausing for
 * the right duration, whereas all the other CPUs pause for twice as long:
 * once in oops_enter(), once in oops_exit().
 */

and indeed, do_oops_enter_exit() is what does the waiting.

IMHO, we should enter the emergency context after the waiting in
oops_enter() and exit it before the waiting in oops_exit(), so that
the pauses stay outside the emergency section. Something like:


 void oops_enter(void)
 {
 	tracing_off();
 	/* can't trust the integrity of the kernel anymore: */
 	debug_locks_off();
 	do_oops_enter_exit();
+	nbcon_cpu_emergency_enter();
 
 	if (sysctl_oops_all_cpu_backtrace)
 		trigger_all_cpu_backtrace();
 }

 void oops_exit(void)
 {
+	nbcon_cpu_emergency_exit();
 	do_oops_enter_exit();
 	print_oops_end_marker();
 	kmsg_dump(KMSG_DUMP_OOPS);
 }
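
To make the ordering argument concrete, here is a tiny user-space model.
It is only a sketch and not kernel code: struct oops_model, the model_*()
helpers and the run_*() functions are all made up for illustration, and
model_pause() merely stands in for the waiting done by
do_oops_enter_exit(). It counts how many of the two pauses happen while
the modelled emergency section is open.

```c
#include <assert.h>

/* User-space model only; every name here is hypothetical. */
struct oops_model {
	int emergency_depth;		/* stand-in for emergency nesting */
	int pauses_inside_emergency;	/* pauses taken while depth > 0 */
	int pauses_outside_emergency;	/* pauses taken while depth == 0 */
};

static void model_emergency_enter(struct oops_model *m)
{
	m->emergency_depth++;
}

static void model_emergency_exit(struct oops_model *m)
{
	assert(m->emergency_depth > 0);
	m->emergency_depth--;
}

/* Stand-in for do_oops_enter_exit(): only records where the pause fell. */
static void model_pause(struct oops_model *m)
{
	if (m->emergency_depth > 0)
		m->pauses_inside_emergency++;
	else
		m->pauses_outside_emergency++;
}

/* Ordering as in the posted patch: enter first, exit after both pauses. */
static struct oops_model run_posted_order(void)
{
	struct oops_model m = { 0 };

	model_emergency_enter(&m);	/* start of oops_enter() */
	model_pause(&m);		/* pause in oops_enter() */
	model_pause(&m);		/* pause in oops_exit() */
	model_emergency_exit(&m);	/* near the end of oops_exit() */
	return m;
}

/* Suggested ordering: enter after the first pause, exit before the second. */
static struct oops_model run_suggested_order(void)
{
	struct oops_model m = { 0 };

	model_pause(&m);		/* pause in oops_enter() */
	model_emergency_enter(&m);
	model_emergency_exit(&m);	/* start of oops_exit() */
	model_pause(&m);		/* pause in oops_exit() */
	return m;
}
```

Calling run_posted_order() and run_suggested_order() and comparing the
two counters shows where the pauses land in each variant. The model says
nothing about real console flushing; it only tracks the nesting.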


>  	print_oops_end_marker();
> +	nbcon_cpu_emergency_exit();
>  	kmsg_dump(KMSG_DUMP_OOPS);
>  }

Otherwise, it looks good.

Best Regards,
Petr