lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZrIUUhE_RDxizKcN@pathway.suse.cz>
Date: Tue, 6 Aug 2024 14:17:22 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jani Nikula <jani.nikula@...el.com>,
	Uros Bizjak <ubizjak@...il.com>,
	Joel Granados <j.granados@...sung.com>
Subject: Re: [PATCH printk v7 29/35] printk: Coordinate direct printing in
 panic

On Sun 2024-08-04 02:57:32, John Ogness wrote:
> If legacy and nbcon consoles are registered and the nbcon
> consoles are allowed to flush (i.e. no boot consoles
> registered), the legacy consoles will no longer perform
> direct printing on the panic CPU until after the backtrace
> has been stored. This will give the safe nbcon consoles a
> chance to print the panic messages before allowing the
> unsafe legacy consoles to print.
> 
> If no nbcon consoles are registered or they are not allowed
> to flush, there is no change in behavior (i.e. legacy
> consoles will always attempt to print from the printk()
> caller context).
> 
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2330,12 +2332,30 @@ int vprintk_store(int facility, int level,
>  	return ret;
>  }
>  
> +static bool legacy_allow_panic_sync;
> +
> +/*
> + * This acts as a one-way switch to allow legacy consoles to print from
> + * the printk() caller context on a panic CPU. It also attempts to flush
> + * the legacy consoles in this context.
> + */
> +void printk_legacy_allow_panic_sync(void)
> +{
> +	legacy_allow_panic_sync = true;
> +
> +	if (printing_via_unlock && !is_printk_legacy_deferred()) {
> +		if (console_trylock())
> +			console_unlock();
> +	}
> +}
> +
>  asmlinkage int vprintk_emit(int facility, int level,
>  			    const struct dev_printk_info *dev_info,
>  			    const char *fmt, va_list args)
>  {
> +	bool do_trylock_unlock = printing_via_unlock;
> +	bool defer_legacy = !do_trylock_unlock;

This should be:

	bool defer_legacy = false;

If we do not need to call the legacy loop then we do not need
to do defer it. The nbcon consoles are flushed directly, see below.


>  	int printed_len;
> -	bool in_sched = false;
>  
>  	/* Suppress unimportant messages after panic happens */
>  	if (unlikely(suppress_printk))
> @@ -2349,17 +2369,35 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	if (other_cpu_in_panic())
>  		return 0;
>  
> +	/* If called from the scheduler, we can not call up(). */
>  	if (level == LOGLEVEL_SCHED) {
>  		level = LOGLEVEL_DEFAULT;
> -		in_sched = true;
> +		do_trylock_unlock = false;
> +		defer_legacy = true;

And this should be:

		defer_legacy = do_trylock_unlock;
		do_trylock_unlock = false;

>  	}
>  
>  	printk_delay(level);
>  
>  	printed_len = vprintk_store(facility, level, dev_info, fmt, args);
>  
> -	/* If called from the scheduler, we can not call up(). */
> -	if (!in_sched && printing_via_unlock) {
> +	if (have_nbcon_console && !have_boot_console) {
> +		nbcon_atomic_flush_pending();

The nbcon consoles will always be flushed directly. At least at this
stage of the patchset. I guess that a later patch will offload it to
the kthread when possible.

> +		/*
> +		 * In panic, the legacy consoles are not allowed to print from
> +		 * the printk calling context unless explicitly allowed. This
> +		 * gives the safe nbcon consoles a chance to print out all the
> +		 * panic messages first. This restriction only applies if
> +		 * there are nbcon consoles registered and they are allowed to
> +		 * flush.
> +		 */
> +		if (this_cpu_in_panic() && !legacy_allow_panic_sync) {
> +			do_trylock_unlock = false;
> +			defer_legacy = false;
> +		}
> +	}
> +
> +	if (do_trylock_unlock) {
>  		/*
>  		 * The caller may be holding system-critical or
>  		 * timing-sensitive locks. Disable preemption during
> @@ -3292,7 +3330,9 @@ void console_flush_on_panic(enum con_flush_mode mode)
>  	if (!have_boot_console)
>  		nbcon_atomic_flush_pending();
>  
> -	console_flush_all(false, &next_seq, &handover);
> +	/* Flush legacy consoles once allowed, even when dangerous. */
> +	if (legacy_allow_panic_sync)
> +		console_flush_all(false, &next_seq, &handover);
>  }

This is a good change. It make the console_flush_on_panic()
API more safe to use. console_flush_all() should not be called
before stopping other CPUs.

Best Regards,
Petr


>  
>  /*
> -- 
> 2.39.2

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ