Message-ID: <9109c700-a353-4b12-a7c5-2f67e9ab4e86@paulmck-laptop>
Date: Wed, 13 Dec 2023 10:55:01 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Joel Fernandes <joel@...lfernandes.org>
Cc: "Neeraj Upadhyay (AMD)" <neeraj.iitr10@...il.com>,
rcu@...r.kernel.org, linux-kernel@...r.kernel.org,
kernel-team@...a.com, rostedt@...dmis.org, Neeraj.Upadhyay@....com,
Frederic Weisbecker <frederic@...nel.org>
Subject: Re: [PATCH rcu 3/3] srcu: Explain why callbacks invocations can't
run concurrently

On Wed, Dec 13, 2023 at 01:35:22PM -0500, Joel Fernandes wrote:
> On Wed, Dec 13, 2023 at 12:52 PM Paul E. McKenney <paulmck@...nel.org> wrote:
> >
> > On Wed, Dec 13, 2023 at 09:27:09AM -0500, Joel Fernandes wrote:
> > > On Tue, Dec 12, 2023 at 12:48 PM Neeraj Upadhyay (AMD)
> > > <neeraj.iitr10@...il.com> wrote:
> > > >
> > > > From: Frederic Weisbecker <frederic@...nel.org>
> > > >
> > > > If an SRCU barrier is queued while callbacks are running and a new
> > > > callbacks invocator for the same sdp were to run concurrently, the
> > > > RCU barrier might execute too early. As this requirement is non-obvious,
> > > > make sure to keep a record.
> > > >
> > > > Signed-off-by: Frederic Weisbecker <frederic@...nel.org>
> > > > Reviewed-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> > > > Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> > > > Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@...il.com>
> > > > ---
> > > > kernel/rcu/srcutree.c | 6 ++++++
> > > > 1 file changed, 6 insertions(+)
> > > >
> > > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > > > index 2bfc8ed1eed2..0351a4e83529 100644
> > > > --- a/kernel/rcu/srcutree.c
> > > > +++ b/kernel/rcu/srcutree.c
> > > > @@ -1715,6 +1715,11 @@ static void srcu_invoke_callbacks(struct work_struct *work)
> > > > WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL));
> > > > rcu_segcblist_advance(&sdp->srcu_cblist,
> > > > rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> > > > + /*
> > > > + * Although this function is theoretically re-entrant, concurrent
> > > > + * callbacks invocation is disallowed to avoid executing an SRCU barrier
> > > > + * too early.
> > > > + */
> > >
> > > Side comment:
> > > I guess even without the barrier reasoning, it is best not to allow
> > > concurrent CB execution anyway since it diverges from the behavior of
> > > straight RCU :)
> >
> > Good point!
> >
> > But please do not forget item 12 on the list in checklist.rst. ;-)
> > (Which I just updated to include the other call_rcu*() functions.)
>
> I think this is more so now with recent kernels (with the dynamic nocb
> switch) than with older kernels right? I haven't kept up with the
> checklist recently (which is my bad).

You are quite correct!  But even before this, I was saying that
lack of same-CPU callback concurrency was an accident of the current
implementation rather than a guarantee.  For example, there might come
a time when RCU needs to respond to callback flooding with concurrent
execution of the flooded CPU's callbacks.  Or not, but we do need to
keep this option open.

> My understanding comes from the fact that the RCU barrier depends on
> callbacks on the same CPU executing in order with straight RCU
> otherwise it breaks. Hence my comment. But as you pointed out, that's
> outdated knowledge.

That is still one motivation for ordered execution of callbacks.  For the
dynamic nocb switch, we could have chosen to make rcu_barrier() place
a callback on both lists, but we instead chose to exclude rcu_barrier()
calls during the switch.

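
[Illustration, not part of the original message.] The ordering
dependency discussed above can be sketched with a toy user-space model.
This is not kernel code: every name below (toy_cb, toy_cblist,
toy_enqueue, and so on) is hypothetical, and the "concurrent" case is
simulated deterministically as one possible interleaving rather than
run on real threads. It assumes only what the thread states: a barrier
queues a marker callback behind the callbacks already on the list, and
that marker must not run before them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_CBS 16

/* One queued callback in the toy model (hypothetical type). */
struct toy_cb {
	bool is_barrier;	/* true for the barrier's marker callback */
	bool invoked;		/* set once the callback has run */
};

/* A single per-CPU-style FIFO callback list (hypothetical type). */
struct toy_cblist {
	struct toy_cb cbs[TOY_MAX_CBS];
	size_t head;		/* next callback to invoke */
	size_t tail;		/* next free slot */
};

/* Queue a callback at the tail and return its index. */
static size_t toy_enqueue(struct toy_cblist *l, bool is_barrier)
{
	assert(l->tail < TOY_MAX_CBS);
	l->cbs[l->tail].is_barrier = is_barrier;
	l->cbs[l->tail].invoked = false;
	return l->tail++;
}

/* Count callbacks queued before idx that have not run yet. */
static size_t toy_pending_before(const struct toy_cblist *l, size_t idx)
{
	size_t n = 0;

	for (size_t i = 0; i < idx; i++)
		if (!l->cbs[i].invoked)
			n++;
	return n;
}

/*
 * Single invoker, strict FIFO order.  Returns how many earlier
 * callbacks were still pending when the barrier marker ran, or
 * (size_t)-1 if no barrier marker was queued.
 */
static size_t toy_invoke_in_order(struct toy_cblist *l)
{
	size_t pending_at_barrier = (size_t)-1;

	while (l->head < l->tail) {
		size_t i = l->head++;

		if (l->cbs[i].is_barrier)
			pending_at_barrier = toy_pending_before(l, i);
		l->cbs[i].invoked = true;
	}
	return pending_at_barrier;
}

/*
 * Simulated second invoker: invoker A has claimed [head, split) but
 * has not run it yet when invoker B claims and runs [split, tail).
 * Same return value as toy_invoke_in_order().
 */
static size_t toy_invoke_concurrently(struct toy_cblist *l, size_t split)
{
	size_t pending_at_barrier = (size_t)-1;

	for (size_t i = split; i < l->tail; i++) {	/* invoker B */
		if (l->cbs[i].is_barrier)
			pending_at_barrier = toy_pending_before(l, i);
		l->cbs[i].invoked = true;
	}
	for (size_t i = l->head; i < split; i++)	/* invoker A, late */
		l->cbs[i].invoked = true;
	l->head = l->tail;
	return pending_at_barrier;
}
```

With two ordinary callbacks queued ahead of one barrier marker,
toy_invoke_in_order() reports zero earlier callbacks pending when the
marker runs, whereas toy_invoke_concurrently() with the split placed
at the marker reports two. That second case is the "barrier might
execute too early" scenario from the commit log, under the assumptions
stated above.
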
> I should just shut up and hide in shame now.

No need for that!  After all, one motivation for Requirements.rst was
to help me keep track of all this stuff.

							Thanx, Paul

> :-/
>
> - Joel