[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZrIUUhE_RDxizKcN@pathway.suse.cz>
Date: Tue, 6 Aug 2024 14:17:22 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jani Nikula <jani.nikula@...el.com>,
Uros Bizjak <ubizjak@...il.com>,
Joel Granados <j.granados@...sung.com>
Subject: Re: [PATCH printk v7 29/35] printk: Coordinate direct printing in
panic
On Sun 2024-08-04 02:57:32, John Ogness wrote:
> If legacy and nbcon consoles are registered and the nbcon
> consoles are allowed to flush (i.e. no boot consoles
> registered), the legacy consoles will no longer perform
> direct printing on the panic CPU until after the backtrace
> has been stored. This will give the safe nbcon consoles a
> chance to print the panic messages before allowing the
> unsafe legacy consoles to print.
>
> If no nbcon consoles are registered or they are not allowed
> to flush, there is no change in behavior (i.e. legacy
> consoles will always attempt to print from the printk()
> caller context).
>
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2330,12 +2332,30 @@ int vprintk_store(int facility, int level,
> return ret;
> }
>
> +static bool legacy_allow_panic_sync;
> +
> +/*
> + * This acts as a one-way switch to allow legacy consoles to print from
> + * the printk() caller context on a panic CPU. It also attempts to flush
> + * the legacy consoles in this context.
> + */
> +void printk_legacy_allow_panic_sync(void)
> +{
> + legacy_allow_panic_sync = true;
> +
> + if (printing_via_unlock && !is_printk_legacy_deferred()) {
> + if (console_trylock())
> + console_unlock();
> + }
> +}
> +
> asmlinkage int vprintk_emit(int facility, int level,
> const struct dev_printk_info *dev_info,
> const char *fmt, va_list args)
> {
> + bool do_trylock_unlock = printing_via_unlock;
> + bool defer_legacy = !do_trylock_unlock;
This should be:
bool defer_legacy = false;
If we do not need to call the legacy loop then we do not need
to do defer it. The nbcon consoles are flushed directly, see below.
> int printed_len;
> - bool in_sched = false;
>
> /* Suppress unimportant messages after panic happens */
> if (unlikely(suppress_printk))
> @@ -2349,17 +2369,35 @@ asmlinkage int vprintk_emit(int facility, int level,
> if (other_cpu_in_panic())
> return 0;
>
> + /* If called from the scheduler, we can not call up(). */
> if (level == LOGLEVEL_SCHED) {
> level = LOGLEVEL_DEFAULT;
> - in_sched = true;
> + do_trylock_unlock = false;
> + defer_legacy = true;
And this should be:
defer_legacy = do_trylock_unlock;
do_trylock_unlock = false;
> }
>
> printk_delay(level);
>
> printed_len = vprintk_store(facility, level, dev_info, fmt, args);
>
> - /* If called from the scheduler, we can not call up(). */
> - if (!in_sched && printing_via_unlock) {
> + if (have_nbcon_console && !have_boot_console) {
> + nbcon_atomic_flush_pending();
The nbcon consoles will always be flushed directly. At least at this
stage of the patchset. I guess that a later patch will offload it to
the kthread when possible.
> + /*
> + * In panic, the legacy consoles are not allowed to print from
> + * the printk calling context unless explicitly allowed. This
> + * gives the safe nbcon consoles a chance to print out all the
> + * panic messages first. This restriction only applies if
> + * there are nbcon consoles registered and they are allowed to
> + * flush.
> + */
> + if (this_cpu_in_panic() && !legacy_allow_panic_sync) {
> + do_trylock_unlock = false;
> + defer_legacy = false;
> + }
> + }
> +
> + if (do_trylock_unlock) {
> /*
> * The caller may be holding system-critical or
> * timing-sensitive locks. Disable preemption during
> @@ -3292,7 +3330,9 @@ void console_flush_on_panic(enum con_flush_mode mode)
> if (!have_boot_console)
> nbcon_atomic_flush_pending();
>
> - console_flush_all(false, &next_seq, &handover);
> + /* Flush legacy consoles once allowed, even when dangerous. */
> + if (legacy_allow_panic_sync)
> + console_flush_all(false, &next_seq, &handover);
> }
This is a good change. It make the console_flush_on_panic()
API more safe to use. console_flush_all() should not be called
before stopping other CPUs.
Best Regards,
Petr
>
> /*
> --
> 2.39.2
Powered by blists - more mailing lists