lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 17 Feb 2022 13:52:25 +0100
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Jiri Slaby <jirislaby@...nel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Josh Triplett <josh@...htriplett.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Corey Minyard <cminyard@...sta.com>,
        Kees Cook <keescook@...omium.org>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Mark Brown <broonie@...nel.org>,
        Josef Bacik <josef@...icpanda.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Shawn Guo <shawn.guo@...aro.org>,
        Matti Vaittinen <matti.vaittinen@...rohmeurope.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Wang Qing <wangqing@...o.com>, Tejun Heo <tj@...nel.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        Alexander Potapenko <glider@...gle.com>,
        Stephen Boyd <sboyd@...nel.org>,
        Nicholas Piggin <npiggin@...il.com>, rcu@...r.kernel.org
Subject: Re: [PATCH printk v1 09/13] printk: add functions to allow direct
 printing

On Mon 2022-02-07 20:49:19, John Ogness wrote:
> Once kthread printing is introduced, console printing will no longer
> occur in the context of the printk caller. However, there are some
> special contexts where it is desirable for the printk caller to
> directly print out kernel messages. Using pr_flush() to wait for
> threaded printers is only possible if the caller is in a sleepable
> context and the kthreads are active. That is not always the case.
> 
> Introduce printk_direct_enter() and printk_direct_exit() functions
> to explicitly (and globally) activate/deactivate direct console
> printing.

We should make it more clear what direct console printing means.

It is just the best effort to print messages on consoles when
they are unused at the moment. By other words, it is the legacy
mode that uses trylock to get access to consoles.

> Activate direct printing for:
>  - sysrq
>  - emergency reboot/shutdown
>  - cpu and rcu stalls
>  - hard and soft lockups
>  - hung tasks
>  - stack dumps

It would be great to mention what rules of thumb were used to choose
these locations, see below.

> Signed-off-by: John Ogness <john.ogness@...utronix.de>
> ---
>  drivers/tty/sysrq.c     |  2 ++
>  include/linux/printk.h  | 11 +++++++++++
>  kernel/hung_task.c      | 11 ++++++++++-
>  kernel/printk/printk.c  | 39 ++++++++++++++++++++++++++++++++++++++-
>  kernel/rcu/tree_stall.h |  2 ++
>  kernel/reboot.c         | 14 +++++++++++++-
>  kernel/watchdog.c       |  4 ++++
>  kernel/watchdog_hld.c   |  4 ++++
>  lib/dump_stack.c        |  2 ++
>  lib/nmi_backtrace.c     |  2 ++
>  10 files changed, 88 insertions(+), 3 deletions(-)
> 
> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -594,9 +594,11 @@ void __handle_sysrq(int key, bool check_mask)
>  		 * should not) and is the invoked operation enabled?
>  		 */
>  		if (!check_mask || sysrq_on_mask(op_p->enable_mask)) {
> +			printk_direct_enter();
>  			pr_info("%s\n", op_p->action_msg);
>  			console_loglevel = orig_log_level;
>  			op_p->handler(key);
> +			printk_direct_exit();
>  		} else {
>  			pr_info("This sysrq operation is disabled.\n");
>  			console_loglevel = orig_log_level;

We should handle all messages the same way, including "This sysrq operation
is disabled" and "HELP:" section. @suppress_printk is disabled
for all these messages as well.


> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3333,9 +3359,15 @@ static void wake_up_klogd_work_func(struct irq_work *irq_work)
>  	int pending = __this_cpu_xchg(printk_pending, 0);
>  
>  	if (pending & PRINTK_PENDING_OUTPUT) {
> +		if (pending & PRINTK_DIRECT_OUTPUT)
> +			printk_direct_enter();
> +
>  		/* If trylock fails, someone else is doing the printing */
>  		if (console_trylock())
>  			console_unlock();
> +
> +		if (pending & PRINTK_DIRECT_OUTPUT)
> +			printk_direct_exit();
>  	}

I want to be sure that we are on the same page.

This is a good example where the direct output is used in a
non-preemtive context. It means that it might cause soft or
live-lockups. And it might break RT guarantees.

It might be worth it when the system already is in big troubles
and RT guarantees are already broken. This might be true for:

    + emergency reboot/shutdown

We probably should use it also when @suppress_printk is already
disabled:

    + sysrq

But I am not sure about situations where only a particular process
or CPU is in troubles:

    + cpu and rcu stalls
    + hard and soft lockups
    + hung tasks
    + "generic" stack dumps


The risks:

On one hand, I like that it is a conservative approach. It reduces
the risk of bad user experience with switching to printk kthreads.

On the other hand, it means that the problem with soft/live-lockups
will still be there. Also the direct output is global and affects
any messages. It means that we will still need to keep and maintain
printk_deferred() for scheduler, NMI, console drivers, ...


My preferences:

I prefer to do the changes proposed by this patch (with updated sysrq
handling) in the 1st stage. It is more conservative. The switch to
kthreads in most other situations will be big enough step.

But I think that we might go even further in the future. After all,
the plan was to use the direct printing only when the system is about
to die. This might eventually allow to remove printk_deferred()
and all the possible soft/live-lockups.

Best Regards,
Petr

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ