Message-ID: <Z9qK_QGK2UsJqLOR@p200300d06f3e98759ed3c196478e337b.dip0.t-ipconnect.de>
Date: Wed, 19 Mar 2025 10:14:37 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Boqun Feng <boqun.feng@...il.com>,
Joel Fernandes <joelagnelf@...dia.com>,
Neeraj Upadhyay <neeraj.upadhyay@....com>,
Uladzislau Rezki <urezki@...il.com>,
Zqiang <qiang.zhang1211@...il.com>, rcu <rcu@...r.kernel.org>
Subject: Re: [PATCH 4/5] rcu/exp: Warn on QS requested on dying CPU
On Tue, Mar 18, 2025 at 10:21:48AM -0700, Paul E. McKenney wrote:
> On Fri, Mar 14, 2025 at 03:36:41PM +0100, Frederic Weisbecker wrote:
> > It is not possible to send an IPI to a dying CPU that has passed the
> > CPUHP_TEARDOWN_CPU stage. Remaining unhandled IPIs are handled later at
> > CPUHP_AP_SMPCFD_DYING stage by stop machine. This is the last
> > opportunity for RCU exp handler to request an expedited quiescent state.
> > And the upcoming final context switch between stop machine and idle must
> > have reported the requested quiescent state.
> >
> > Therefore, it should not be possible to observe a pending requested
> > expedited quiescent state when RCU finally stops watching the outgoing
> > CPU. Once IPIs aren't possible anymore, the QS for the target CPU will
> > be reported on its behalf by the RCU exp kworker.
> >
> > Provide an assertion to verify those expectations.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
>
> But what do we do if this assertion triggers?
It means there is likely something to fix: an IPI was sent and the CPU
somehow missed it.
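
To make the expected ordering concrete, here is a minimal userspace sketch of the invariant the assertion checks. It is not kernel code: every `model_*` name is hypothetical, and `exp_qs_pending` merely stands in for `rdp->cpu_no_qs.b.exp`. It assumes, as the changelog describes, that leftover IPIs are handled at CPUHP_AP_SMPCFD_DYING and that the final context switch to idle reports the requested QS.

```c
#include <stdbool.h>

/* Hypothetical userspace model of an outgoing CPU; not kernel code. */
struct model_cpu {
	bool ipi_queued;	/* an expedited-GP IPI was sent but not yet handled */
	bool exp_qs_pending;	/* stands in for rdp->cpu_no_qs.b.exp */
	int warnings;		/* number of times the final check fired */
};

/* Models rcu_exp_handler(): the IPI handler requests an expedited QS. */
static void model_exp_handler(struct model_cpu *cpu)
{
	cpu->exp_qs_pending = true;
}

/* Models CPUHP_AP_SMPCFD_DYING: stop machine handles leftover IPIs. */
static void model_smpcfd_dying(struct model_cpu *cpu)
{
	if (cpu->ipi_queued) {
		cpu->ipi_queued = false;
		model_exp_handler(cpu);
	}
}

/* Models the final context switch from stop machine to idle. */
static void model_final_context_switch(struct model_cpu *cpu)
{
	cpu->exp_qs_pending = false;	/* the requested QS is reported here */
}

/* Models the new check in rcutree_report_cpu_dead(). */
static void model_report_cpu_dead(struct model_cpu *cpu)
{
	if (cpu->exp_qs_pending)
		cpu->warnings++;
}

/*
 * Run the offline sequence and return the number of warnings.
 * skip_context_switch injects the kind of bug the check is meant to catch.
 */
static int model_offline(bool ipi_queued, bool skip_context_switch)
{
	struct model_cpu cpu = { .ipi_queued = ipi_queued };

	model_smpcfd_dying(&cpu);
	if (!skip_context_switch)
		model_final_context_switch(&cpu);
	model_report_cpu_dead(&cpu);
	return cpu.warnings;
}
```

In this model the check fires only when a request was raised and the reporting step was skipped, which is the "CPU missed it" case described above.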
> And do we want it to take
> effect only in kernels built with CONFIG_PROVE_RCU? Or is such a broken
> assumption bad enough to justify a splat in production kernels?
>
> If the answer to the last question is "yes" (and you, not me, work for
> a distro, so it is your question to answer):
I think it's bad enough to deserve a real warning. Also, this is a slow
path, so the cost of the check is not a concern.
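
For comparison, the two options can be sketched in userspace as follows. This is only a model: `struct warn_once_site`, `model_warn_on_once()` and the `check_*` helpers are hypothetical, and the `prove_rcu` flag stands in for CONFIG_PROVE_RCU. In the kernel, the debug-only variant might look like `WARN_ON_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) && rdp->cpu_no_qs.b.exp)`, which is an assumption for illustration and not what the patch does.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-call-site state behind WARN_ON_ONCE(). */
struct warn_once_site {
	bool warned;	/* set after the first splat */
	int splats;	/* number of warnings actually printed */
};

/* Print at most one warning per site; always return the condition. */
static bool model_warn_on_once(struct warn_once_site *site, bool cond)
{
	if (cond && !site->warned) {
		site->warned = true;
		site->splats++;
		fprintf(stderr, "WARNING: expedited QS still pending on dying CPU\n");
	}
	return cond;
}

/* Option taken in the patch: warn in every kernel, debug or production. */
static bool check_unconditional(struct warn_once_site *site, bool exp_qs_pending)
{
	return model_warn_on_once(site, exp_qs_pending);
}

/* Alternative: warn only when prove_rcu (modeling CONFIG_PROVE_RCU) is set. */
static bool check_debug_only(struct warn_once_site *site, bool exp_qs_pending,
			     bool prove_rcu)
{
	return model_warn_on_once(site, prove_rcu && exp_qs_pending);
}
```

The unconditional form reports the broken assumption at most once per site in any build, while the gated form stays silent unless the debug option is enabled.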
>
> Reviewed-by: Paul E. McKenney <paulmck@...nel.org>
Thanks!
>
> Thanx, Paul
>
> > ---
> > kernel/rcu/tree.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 3fe68057d8b4..79dced5fb72e 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -4321,6 +4321,12 @@ void rcutree_report_cpu_dead(void)
> >  	 * may introduce a new READ-side while it is actually off the QS masks.
> >  	 */
> >  	lockdep_assert_irqs_disabled();
> > +	/*
> > +	 * CPUHP_AP_SMPCFD_DYING was the last call for rcu_exp_handler() execution.
> > +	 * The requested QS must have been reported on the last context switch
> > +	 * from stop machine to idle.
> > +	 */
> > +	WARN_ON_ONCE(rdp->cpu_no_qs.b.exp);
> >  	// Do any dangling deferred wakeups.
> >  	do_nocb_deferred_wakeup(rdp);
> >
> > --
> > 2.48.1
> >