lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87pmoh3yf9.fsf@jogness.linutronix.de>
Date:   Mon, 24 Jan 2022 17:18:42 +0106
From:   John Ogness <john.ogness@...utronix.de>
To:     Stephen Brennan <stephen.s.brennan@...cle.com>,
        Petr Mladek <pmladek@...e.com>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>
Cc:     Stephen Brennan <stephen.s.brennan@...cle.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 4/4] printk: Drop console_sem during panic

On 2022-01-21, Stephen Brennan <stephen.s.brennan@...cle.com> wrote:
> If another CPU is in panic, we are about to be halted. Try to gracefully
> drop console_sem and allow the panic CPU to grab it easily.
>
> Suggested-by: Petr Mladek <pmladek@...e.com>
> Signed-off-by: Stephen Brennan <stephen.s.brennan@...cle.com>
> ---
>  kernel/printk/printk.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index ca253ac07615..c2dc8ebd9509 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2668,7 +2668,7 @@ void console_unlock(void)
>  
>  	for (;;) {
>  		size_t ext_len = 0;
> -		int handover;
> +		int handover, pcpu;
>  		size_t len;
>  
>  skip:
> @@ -2739,6 +2739,12 @@ void console_unlock(void)
>  		if (handover)
>  			return;
>  
> +		/* Allow panic_cpu to take over the consoles safely */
> +		pcpu = atomic_read(&panic_cpu);
> +		if (unlikely(pcpu != PANIC_CPU_INVALID &&
> +		    pcpu != raw_smp_processor_id()))
> +			break;
> +

Keep in mind that after the "break", this context will try to re-acquire
the console lock and continue printing. That is a pretty small window
for the panic CPU to attempt a trylock.

Perhaps the retry after the loop should also be avoided for non-panic
CPUs. This would rely on the panic CPU taking over (as your comment
suggests will happen). Since the panic-CPU calls pr_emerg() as the final
record before drifting off to neverland, that is probably OK.

Something like:

@@ -2731,7 +2731,8 @@ void console_unlock(void)
 	 * there's a new owner and the console_unlock() from them will do the
 	 * flush, no worries.
 	 */
-	retry = prb_read_valid(prb, next_seq, NULL);
+	retry = (pcpu != raw_smp_processor_id()) &&
+		prb_read_valid(prb, next_seq, NULL);
 	if (retry && console_trylock())
 		goto again;
 }

I would also like to see a comment about why it is acceptable to use
raw_smp_processor_id() in a context that has migration
enabled. Something like: raw_smp_processor_id() can be used because this
context cannot be migrated to the panic CPU.

John Ogness

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ