linux-kernel - Re: [PATCH printk 06/18] printk: Protect [un]register

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YzrzmHZYxp8iKEDT@alley>
Date:   Mon, 3 Oct 2022 16:37:12 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH printk 06/18] printk: Protect [un]register_console() with
 a mutex

On Fri 2022-09-30 22:32:32, John Ogness wrote:
> On 2022-09-30, Petr Mladek <pmladek@...e.com> wrote:
> > You want to get cosole_lock() out of the way of NOBKL consoles. What
> > does it exactly mean, please?
> 
> It means that a system with only NOBKL consoles will never take the
> console_lock.

What is exactly wrong with console_lock, please?

Is the main problem that it is a semaphore?

Or is it a problem that it is used in some console
drivers for other purposes?

My view:

If you use only NOBLK consoles then you should never take
console_lock via con->write(). Also the printk kthread
main-loop does not need to take console_lock.

If the above is true then console_lock should be needed
only by register_console() and unregister_console(). Anything
else should be doable via srcu_read_lock.

Is it a problem when console_lock is needed in register_console
and unregister_console on RT?

> > What code paths are important to achieve this?
> 
> Anything that iterates or queries the consoles is taking the
> console_lock right now. We want that code to use something else for
> those tasks (console_srcu, console_mutex, atomic_state).

IMHO, console_srcu should be enough for any query.

And even when the write lock was needed from some
reasons why mutex is needed? Is semaphore completely
unacceptable on RT? Do you want to avoid semaphore on RT
at any cost?

> My v2 will hopefully change your POV. I will make it clear (in comments
> and implementation) that the console_lock does _not_ protect the console
> list. All iteration and querying will have no choice but to use the new
> mechanisms for list iteration and checking/setting CON_ENABLED.

Before you spend too much time on it then please try to solve
the problem below.

> Then the console_lock's only function is to block legacy consoles from
> printing and making sure that multiple legacy consoles are not printing
> in parallel. And, of course, it will still function as a general BKL
> lock for fbdev, which may be relying on its locking function to
> synchronize some fbdev data.
> 
> Note that the end result will be no change in behavior for legacy
> consoles. But it allows legacy and NOBKL consoles to run simultaneously
> while sharing significant amounts of code, and provides a clear path for
> console drivers to begin converting. As a side-effect, the first step of
> reducing the scope of the console_lock will have been taken.

OK, there are people that want to disable kthreads by some command line
option. There is a non-trivial possibility that this "feature" will
be there forever.

How exactly do you want to support this legacy mode, please?

The above proposal suggests that it might be something like:

register_console()
{
	console_list_lock();

	if (!need_console())
		goto out;

	if (!try_enable_console())
		goto out;

	if (!(con->flags & CON_NOBLK))
		console_lock()

	add_console_into_the_list();

	if (!(con->flags & CON_NOBLK))
		console_unlock()

out:
	console_list_unlock();
}

vprintk_emit()
{
	vprintk_store();

	wake_up_klogd();

	if (only_noblk_consoles || in_sched)
		return;

	if (console_trylock()) {
		console_flush_all();
		__console_unlock();
}

console_flush_all()
{
	/*
	 * !!! WARNING !!!
	 *  Must take srcu_read_lock(&console_src) here.
	 *  Must never take console_list_lock() here.
	 */
	srcu_read_lock(&console_srcu);

	for_each_console() {
		...
	}

	srcu_read_unlock(&console_srcu);
}

The srcu_read_lock() is needed because NOBKL consoles are
added into the list without console_lock().

There are actually two reasons why we could not take
console_list_lock() in console_flush_all():

    + it is a sleeping lock and vprintk_emit() might
      be called in atomic context

    + it might cause ABBA deadlock with console_lock

IMHO, this is not obvious. The rules for using the three global
locks (console_lock(), console_list_lock(), console_srcu)
look quite complicated to me.

Anyway, are you able to implement vprintk_emit()/console_flush_all()
without console_lock()?

If we need to keep console_trylock() in vprintk_emit() forever
then we really need a good justification why console_list_lock()
is needed.

Please, show me a code path where console_mutex is needed
as the only acceptable solution for RT.

Best Regards,
Petr