Message-ID: <ZRRDiAUJMHAgiDnD@alley>
Date: Wed, 27 Sep 2023 17:00:24 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org,
"Paul E. McKenney" <paulmck@...nel.org>,
Frederic Weisbecker <frederic@...nel.org>,
Neeraj Upadhyay <quic_neeraju@...cinc.com>,
Joel Fernandes <joel@...lfernandes.org>,
Josh Triplett <josh@...htriplett.org>,
Boqun Feng <boqun.feng@...il.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>,
Zqiang <qiang.zhang1211@...il.com>, rcu@...r.kernel.org
Subject: Re: [PATCH printk v2 10/11] rcu: Add atomic write enforcement for
rcu stalls
On Wed 2023-09-20 01:14:55, John Ogness wrote:
> Invoke the atomic write enforcement functions for rcu stalls to
> ensure that the information gets out to the consoles.
>
> It is important to note that if there are any legacy consoles
> registered, they will be attempting to directly print from the
> printk-caller context, which may jeopardize the reliability of
> the atomic consoles. Optimally there should be no legacy
> consoles registered.
>
> Signed-off-by: John Ogness <john.ogness@...utronix.de>
> ---
> kernel/rcu/tree_stall.h | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
> index 6f06dc12904a..0a58f8b233d8 100644
> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -8,6 +8,7 @@
> */
>
> #include <linux/kvm_para.h>
> +#include <linux/console.h>
>
> //////////////////////////////////////////////////////////////////////////////
> //
> @@ -582,6 +583,7 @@ static void rcu_check_gp_kthread_expired_fqs_timer(void)
>
> static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
> {
> + enum nbcon_prio prev_prio;
> int cpu;
> unsigned long flags;
> unsigned long gpa;
> @@ -597,6 +599,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
> if (rcu_stall_is_suppressed())
> return;
>
> + prev_prio = nbcon_atomic_enter(NBCON_PRIO_EMERGENCY);
> +
> /*
> * OK, time to rat on our buddy...
> * See Documentation/RCU/stallwarn.rst for info on how to debug
> @@ -651,6 +655,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
> panic_on_rcu_stall();
>
> rcu_force_quiescent_state(); /* Kick them all. */
> +
> + nbcon_atomic_exit(NBCON_PRIO_EMERGENCY, prev_prio);
The locations look reasonable to me. I just hope that we will
use another API, nbcon_emergency_enter()/exit(), in the end.

Note that the new API would allow flushing the messages in
the emergency context immediately from printk().

In that case, we would need to handle nmi_trigger_cpumask_backtrace()
in some special way.

This function would be called from the emergency context but
the nmi_cpu_backtrace() callbacks would be called on other
CPUs in normal context.
For this case I would add something like:
void nbcon_flush_all_emergency(void)
{
	enum nbcon_prio prio = nbcon_get_default_prio();

	if (prio >= NBCON_PRIO_EMERGENCY)
		nbcon_flush_all();
}
, where the POC of nbcon_get_default_prio() and nbcon_flush_all()
was in the reply to the 7th patch, see
https://lore.kernel.org/all/ZRLBxsXPCym2NC5Q@alley/
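
The priority check in the proposal can be tried outside the kernel.
The following is only a userspace sketch, not kernel code: the enum
values and their ordering are assumed, and nbcon_get_default_prio()
and nbcon_flush_all() are hypothetical stand-ins backed by the mock
variables mock_default_prio and mock_flush_count, which do not exist
in the kernel.

```c
#include <stdbool.h>

/*
 * Userspace model only. The names mirror the patch series, but the
 * exact values and ordering are an assumption made for this sketch.
 */
enum nbcon_prio {
	NBCON_PRIO_NONE = 0,
	NBCON_PRIO_NORMAL,
	NBCON_PRIO_EMERGENCY,
	NBCON_PRIO_PANIC,
};

/* Hypothetical mock state standing in for the per-CPU printk context. */
static enum nbcon_prio mock_default_prio = NBCON_PRIO_NORMAL;
static unsigned int mock_flush_count;

/* Stand-in for the POC helper: report the priority of the current context. */
static enum nbcon_prio nbcon_get_default_prio(void)
{
	return mock_default_prio;
}

/* Stand-in for the POC helper: only count how often a flush is requested. */
static void nbcon_flush_all(void)
{
	mock_flush_count++;
}

/* Flush only when the caller runs in emergency (or higher) context. */
static void nbcon_flush_all_emergency(void)
{
	enum nbcon_prio prio = nbcon_get_default_prio();

	if (prio >= NBCON_PRIO_EMERGENCY)
		nbcon_flush_all();
}
```

In this model a caller in normal context leaves mock_flush_count
unchanged, while a caller in emergency or panic context increments it
once per call.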
Best Regards,
Petr