Message-ID: <ZrDwZfGriZSxmjnp@pathway.suse.cz>
Date: Mon, 5 Aug 2024 17:31:49 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH printk v7 24/35] printk: nbcon: Flush new records on
 device_release()

On Sun 2024-08-04 02:57:27, John Ogness wrote:
> There may be new records that were added while a driver was
> holding the nbcon context for non-printing purposes. These
> new records must be flushed by the nbcon_device_release()
> context because no other context will do it.
> 
> If boot consoles are registered, the legacy loop is used
> (either direct or per irq_work) to handle the flushing.
> 
> Signed-off-by: John Ogness <john.ogness@...utronix.de>

It makes some sense and seems to work. I do not know how to
make it better.

Reviewed-by: Petr Mladek <pmladek@...e.com>

But it makes me a bit nervous, see below:

> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1326,10 +1326,30 @@ EXPORT_SYMBOL_GPL(nbcon_device_try_acquire);
>  void nbcon_device_release(struct console *con)
>  {
>  	struct nbcon_context *ctxt = &ACCESS_PRIVATE(con, nbcon_device_ctxt);
> +	int cookie;
>  
>  	if (!nbcon_context_exit_unsafe(ctxt))
>  		return;
>  
>  	nbcon_context_release(ctxt);
> +
> +	/*
> +	 * This context must flush any new records added while the console
> +	 * was locked. The console_srcu_read_lock must be taken to ensure
> +	 * the console is usable throughout flushing.
> +	 */
> +	cookie = console_srcu_read_lock();
> +	if (console_is_usable(con, console_srcu_read_flags(con)) &&
> +	    prb_read_valid(prb, nbcon_seq_read(con), NULL)) {
> +		if (!have_boot_console) {
> +			__nbcon_atomic_flush_pending_con(con, prb_next_reserve_seq(prb));
> +		} else if (!is_printk_legacy_deferred()) {
> +			if (console_trylock())
> +				console_unlock();

nbcon_device_release() is going to be called in uart_port_unlock*()
still under the port->lock.

=> It smells like a potential deadlock. The console_flush_all() in
   console_unlock() might want to flush this console under the
   port->lock as well.

   And it almost happens because nbcon_legacy_emit_next_record()
   might eventually take con->device_lock() when called in
   a task context.

   It will not happen here because this code uses console_trylock()
   which would set @console_may_schedule to false.

Anyway, it would look safer if the flush were done after releasing
the port->lock.
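
To make the ordering concern concrete, here is a toy, single-threaded
userspace model. It is only a sketch: struct model_port and the model_*()
helpers are hypothetical names and do not exist in the kernel, and the
"lock" is a plain flag that reports a would-be self-deadlock instead of
hanging. It assumes a flush path that needs the port lock, which is the
case the real code avoids through console_trylock().

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct model_port {
	bool lock_held;	/* stands in for port->lock */
	int pending;	/* records added while the port was owned */
	int flushed;	/* records written out so far */
};

/* A blocking lock taken while already held would never return. */
static bool model_lock_would_deadlock(const struct model_port *p)
{
	return p->lock_held;
}

/* A flush that needs the port lock, as a console write path might. */
static int model_flush(struct model_port *p)
{
	if (model_lock_would_deadlock(p))
		return -1;	/* report the problem instead of hanging */

	p->lock_held = true;
	p->flushed += p->pending;
	p->pending = 0;
	p->lock_held = false;
	return 0;
}

/* Variant A: flush from the release path, port lock still held. */
static int model_unlock_flush_inside(struct model_port *p)
{
	int ret = model_flush(p);

	p->lock_held = false;
	return ret;
}

/* Variant B: drop the port lock first, then flush. */
static int model_unlock_flush_after(struct model_port *p)
{
	p->lock_held = false;
	return model_flush(p);
}
```

In this model, variant A leaves the pending records unflushed and reports
the would-be deadlock, while variant B flushes them all. It says nothing
about which place in the real unlock path is the right one for the flush.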

I still have to think about this.

> +		} else {
> +			printk_trigger_flush();
> +		}
> +	}
> +	console_srcu_read_unlock(cookie);
>  }
>  EXPORT_SYMBOL_GPL(nbcon_device_release);

Best Regards,
Petr