linux-kernel - Re: [PATCH] panic: disable optimistic spin after halting CPUs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87a6fzxwak.fsf@stepbren-lnx.us.oracle.com>
Date:   Thu, 13 Jan 2022 11:36:35 -0800
From:   Stephen Brennan <stephen.s.brennan@...cle.com>
To:     Petr Mladek <pmladek@...e.com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        John Ogness <john.ogness@...utronix.de>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] panic: disable optimistic spin after halting CPUs

Hi Petr,

Thanks for taking a look!

Petr Mladek <pmladek@...e.com> writes:
> On Wed 2022-01-12 16:54:25, Stephen Brennan wrote:
>> A CPU executing with console lock spinning enabled might be halted
>> during a panic. Before console_flush_on_panic(), it's possible for
>> console_trylock() to attempt optimistic spinning, deadlocking the panic
>> CPU:
>> 
>> CPU 0 (panic CPU)             CPU 1
>> -----------------             ------
>>                               printk() {
>>                                 vprintk_func() {
>>                                   vprintk_default() {
>>                                     vprintk_emit() {
>>                                       console_unlock() {
>>                                         console_lock_spinning_enable();
>>                                         ... printing to console ...
>> panic() {
>>   crash_smp_send_stop() {
>>     NMI  -------------------> HALT
>>   }
>>   atomic_notifier_call_chain() {
>>     printk() {
>>       ...
>>       console_trylock_spinnning() {
>>         // optimistic spin infinitely
>
> I see.
>
>> This hang during panic can be induced when a kdump kernel is loaded, and
>> crash_kexec_post_notifiers=1 is present on the kernel command line. The
>> following script which concurrently writes to /dev/kmsg, and triggers a
>> panic, can result in this hang:
>> 
>>     #!/bin/bash
>>     date
>>     # 991 chars (based on log buffer size):
>>     chars="$(printf 'a%.0s' {1..991})"
>>     while :; do
>>         echo $chars > /dev/kmsg
>>     done &
>>     echo c > /proc/sysrq-trigger &
>>     date
>>     exit
>> 
>> Below are the hang rates for the above test case, based on v5.16-rc8
>> before and after this patch:
>> 
>> before:  197 hangs / 340 trials - 57.9%
>> after :    0 hangs / 245 trials -  0.0%
>> 
>> Fixes: dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes")
>> 
>> Signed-off-by: Stephen Brennan <stephen.s.brennan@...cle.com>
>> Reviewed-by: Junxiao Bi <junxiao.bi@...cle.com>
>> ---
>>  include/linux/console.h |  1 +
>>  kernel/panic.c          |  7 +++++++
>>  kernel/printk/printk.c  | 17 +++++++++++++++++
>>  3 files changed, 25 insertions(+)
>> 
>> diff --git a/include/linux/console.h b/include/linux/console.h
>> index a97f277cfdfa..4eeb46927d96 100644
>> --- a/include/linux/console.h
>> +++ b/include/linux/console.h
>> @@ -179,6 +179,7 @@ extern void console_unlock(void);
>>  extern void console_conditional_schedule(void);
>>  extern void console_unblank(void);
>>  extern void console_flush_on_panic(enum con_flush_mode mode);
>> +extern void console_lock_spinning_disable_on_panic(void);
>>  extern struct tty_driver *console_device(int *);
>>  extern void console_stop(struct console *);
>>  extern void console_start(struct console *);
>> diff --git a/kernel/panic.c b/kernel/panic.c
>> index cefd7d82366f..a9340e580b20 100644
>> --- a/kernel/panic.c
>> +++ b/kernel/panic.c
>> @@ -265,6 +265,13 @@ void panic(const char *fmt, ...)
>>  		crash_smp_send_stop();
>>  	}
>>  
>> +	/*
>> +	 * Now that we've halted other CPUs, disable optimistic spinning in
>> +	 * printk(). This avoids deadlocking in console_trylock(), until we call
>> +	 * console_flush_on_panic().
>> +	 */
>> +	console_lock_spinning_disable_on_panic();
>
> The proposed implementation causes that the panicing CPU takes over
> console_lock even when the current owner is in the middle of
> call_console_drivers().

I don't think this is true. All that
console_lock_spinning_disable_on_panic() does is clear out the
console_owner and console_owner_lock, signalling that the panicking CPU
should not attempt to spin and wait for the lock.

The only time this matters is when another task already holds
console_sem and is executing presently in call_console_drivers(), as you
said.

With console_owner set to NULL, console_trylock_spinning() will get
spin=false, and thus it will bail out, returning 0. Thus, the panicking
CPU would fail to acquire console_sem, and so in vprintk_emit it will
not end up going into console_unlock(), or doing call_console_drivers().

It would not be until console_flush_on_panic() down below (when there is
no other option than to try to use the console) that the panicking CPU
actually attempts to call the drivers.

>
> It means that calling console drivers from another CPU is not
> completely safe. This is why console_flush_on_panic() is called
> at end when the console is the only way to see the messages.
>
> We should not do this before kmsg_dump() and __crash_kexec()
> that allow to see the messages a "more safe" way.
>
> I suggest to disable the spinning when panic is in progress. I mean
> something like:
>
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1842,6 +1842,10 @@ static int console_trylock_spinning(void)
>  	if (console_trylock())
>  		return 1;
>  
> +	/* Spinning is not safe when the system is stopped */
> +	if (read_atomic(&panic_cpu) != PANIC_CPU_INVALID)
> +		return 0;
> +
>  	printk_safe_enter_irqsave(flags);
>  
>  	raw_spin_lock(&console_owner_lock);

I think this is pretty much equivalent to my fix, since it also results
in the panicking CPU failing to acquire console_sem during
console_trylock_spinning().

I do think this is better than my fix :) since it doesn't clutter up
panic() even more, nor introduce an additional function. It even avoids
resetting the console_owner_lock spinlock.

I'd be happy to do this as a v2, if you'd prefer?

>
> It would be also great when the current owner releases console_lock
> so that the panicing CPU could take over it.
>
> I think about the following. Well, I am not sure if it would help
> in the real life because other CPUs are stopped quite early in panic().
>
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2646,13 +2650,18 @@ void console_unlock(void)
>  
>  	for (;;) {
>  		size_t ext_len = 0;
> -		int handover;
> +		int handover, pcpu;
>  		size_t len;
>  
>  skip:
>  		if (!prb_read_valid(prb, console_seq, &r))
>  			break;
>  
> +		/* Allow the panic_cpu to take over consoles a safe way. */
> +		pcpu = read_atomic(&panic_cpu);
> +		if (pcpu != PANIC_CPU_INVALID && pcpu != smp_processor_id())
> +			break;
> +
>  		if (console_seq != r.info->seq) {
>  			console_dropped += r.info->seq - console_seq;
>  			console_seq = r.info->seq;
>

I suppose this could help reduce the odds of a CPU getting interrupted
during call_console_drivers(), and it might reduce the odds of
console_sem being held by a halted CPU during panic().

However, it occurs to me that there are two cases:

1) a kdump kernel is loaded. In this case, crash_smp_send_stop() is
used, which (at least on x86_64) sends NMIs. In this case,
__crash_kexec() will not return (barring any errors) and we won't ever
really need to get the console_sem.

2) no kdump kernel is loaded. In this case, smp_send_stop() is used,
which (again on x86_64) sends regular IPIs. In this case, we know we can
at least avoid interrupting call_console_drivers(), since that executes
with local IRQ disabled.

So I'm also unsure how much this would help in the real world. But it's
a small change and so it doesn't need to help much in order to be a net
positive :)

Thanks,
Stephen

>
>
> Note that the above code is not even compile tested. panic_cpu is
> local variable in panic.c. If we go this way than I would define
> some helpers:
>
> bool panic_in_progress(void)
> {
> 	return read_atomic(&panic_cpu) != PANIC_CPU_INVALID;
> }
>
> bool panic_in_progress_on_other_cpu(void)
> {
> 	int pcpu = read_atomic(&panic_cpu);
>
> 	if (pcpu == PANIC_CPU_INVALID)
> 		return false;
>
> 	if (pcpu == smp_processor_id())
> 		return false;
>
> 	return true;
> }
>
> Best Regards,
> Petr