lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZrtjXChY_0wnFXsS@pathway.suse.cz>
Date: Tue, 13 Aug 2024 15:45:22 +0200
From: Petr Mladek <pmladek@...e.com>
To: takakura@...inux.co.jp
Cc: rostedt@...dmis.org, john.ogness@...utronix.de,
	senozhatsky@...omium.org, akpm@...ux-foundation.org, bhe@...hat.com,
	lukas@...ner.de, wangkefeng.wang@...wei.com, ubizjak@...il.com,
	feng.tang@...el.com, j.granados@...sung.com,
	stephen.s.brennan@...cle.com, linux-kernel@...r.kernel.org,
	nishimura@...inux.co.jp, taka@...inux.co.jp
Subject: Re: [PATCH v3 2/2] Handle flushing of CPU backtraces during panic

On Mon 2024-08-12 16:29:31, takakura@...inux.co.jp wrote:
> From: Ryo Takakura <takakura@...inux.co.jp>
> 
> After panic, non-panicked CPU's has been unable to flush ringbuffer
> while they can still write into it. This can affect CPU backtrace
> triggered in panic only able to write into ringbuffer incapable of
> flushing them.

I still think about this. The motivation is basically the same
as in the commit 5d5e4522a7f404d1a96fd ("printk: restore flushing
of NMI buffers on remote CPUs after NMI backtraces").

It would make sense to replace printk_trigger_flush() with
console_try_flush(). And the function should queue the irq
work when it can't be flushed directly, see below.

> Fix the issue by letting the panicked CPU handle the flushing of
> ringbuffer right after non-panicked CPUs finished writing their
> backtraces.
> 
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -260,6 +260,7 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
>  		panic_triggering_all_cpu_backtrace = true;
>  		trigger_all_cpu_backtrace();
>  		panic_triggering_all_cpu_backtrace = false;
> +		console_try_flush();
>  	}
>  
>  	/*
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3284,6 +3284,20 @@ void console_flush_on_panic(enum con_flush_mode mode)
>  	console_flush_all(false, &next_seq, &handover);
>  }
>  
> +/**
> + * console_try_flush - try to flush consoles when safe
> + *
> + * Context: Any, except for NMI.

It is safe even in NMI. is_printk_legacy_deferred() would return true
in this case.

> + */
> +void console_try_flush(void)
> +{
> +	if (is_printk_legacy_deferred())
> +		return;
> +
> +	if (console_trylock())
> +		console_unlock();
> +}

I would do something like:

/**
 * console_try_or_trigger_flush - try to flush consoles directly when
 *	safe or the trigger deferred flush.
 *
 * Context: Any
 */
void console_try_or_trigger_flush(void)
{
	if (!is_printk_legacy_deferred() && console_trylock())
		console_unlock();
	else
		defer_console_output();
}

and use it instead of printk_trigger_flush() in
nmi_trigger_cpumask_backtrace().


Well, I would postpone this patch after we finalize the patchset
adding con->write_atomic() callback. This patch depends on it anyway
via is_printk_legacy_deferred(). The patchset might also add
other wrappers for flushing consoles and we have to choose some
reasonable names. Or John could integrate this patch into the patchset.

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ