Message-ID: <87f074af-a39c-48db-972f-5f46e57aa3db@paulmck-laptop>
Date: Wed, 5 Feb 2025 07:56:06 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Joel Fernandes <joel@...lfernandes.org>
Cc: linux-kernel@...r.kernel.org, Frederic Weisbecker <frederic@...nel.org>,
Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
Josh Triplett <josh@...htriplett.org>,
Boqun Feng <boqun.feng@...il.com>,
Uladzislau Rezki <urezki@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>,
Zqiang <qiang.zhang1211@...il.com>, rcu@...r.kernel.org
Subject: Re: [PATCH] rcu: Merge rcu_seq_done_exact() logic into rcu_seq_done()
On Wed, Feb 05, 2025 at 10:45:08AM -0500, Joel Fernandes wrote:
> On Wed, Feb 5, 2025 at 5:28 AM Paul E. McKenney <paulmck@...nel.org> wrote:
> >
> > On Tue, Feb 04, 2025 at 08:28:29PM -0500, Joel Fernandes wrote:
> > > On Tue, Feb 4, 2025 at 3:22 PM Paul E. McKenney <paulmck@...nel.org> wrote:
> > > >
> > > > On Tue, Feb 04, 2025 at 10:44:45AM -0500, Joel Fernandes wrote:
> > > > > On Thu, Jan 30, 2025 at 12:56 AM Paul E. McKenney <paulmck@...nel.org> wrote:
> > > > > > By "where we were initially", one might reasonably guess that you meant
> > > > > > that the initial value and the target are one and the same. But I suspect
> > > > > > that you are thinking of a third value, the value of the underlying
> > > > > > grace-period sequence number at the time of the call to rcu_seq_snap().
> > > > > > But you might be thinking of something else entirely.
> > > > > >
> > > > > > > Now we move to rcu_seq_done_exact(). It has exactly the same
> > > > > > > behavior except that instead of ULONG_MAX/2, the above
> > > > > > > situations are shrunk to about 10 counts from the target. So
> > > > > > > if the target is 28, then the initial sequence should have
> > > > > > > been at least 18 to avoid a false positive. But while that is
> > > > > > > possible in theory, from what I see in the code it is not used
> > > > > > > this way (at least so far). So all we are left with is the
> > > > > > > false-negative band of ~2.5 GPs.
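> > > > > > >
> > > > > > > (For reference, a sketch of the two helpers along the lines of
> > > > > > > kernel/rcu/rcu.h; the guard-band constant below is written to
> > > > > > > match the ~10-count band described above, so double-check it
> > > > > > > against the actual tree:)
> > > > > > >
> > > > > > > static inline bool rcu_seq_done(unsigned long *sp, unsigned long s)
> > > > > > > {
> > > > > > > 	/* "Done" for anything up to ULONG_MAX/2 past s. */
> > > > > > > 	return ULONG_CMP_GE(READ_ONCE(*sp), s);
> > > > > > > }
> > > > > > >
> > > > > > > static inline bool rcu_seq_done_exact(unsigned long *sp, unsigned long s)
> > > > > > > {
> > > > > > > 	unsigned long cur_s = READ_ONCE(*sp);
> > > > > > >
> > > > > > > 	/* Also "done" if more than ~10 counts (~2.5 GPs) below s,
> > > > > > > 	 * which for a forward-moving counter implies a wrap. */
> > > > > > > 	return ULONG_CMP_GE(cur_s, s) ||
> > > > > > > 	       ULONG_CMP_LT(cur_s, s - (3 * RCU_SEQ_STATE_MASK + 1));
> > > > > > > }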
> > > > > >
> > > > > > Here, I have the same questions. As you no doubt guessed.
> > > > > >
> > > > > > > About "what are the consequences of failing to get this
> > > > > > > right": I believe a false positive is unacceptable except
> > > > > > > maybe in debug code, since it can break functionality -
> > > > > > > too-short GPs and all that. However, a false negative is OK as
> > > > > > > long as the use case can retry later and can afford to wait.
> > > > > > > Did I get that much correct?
> > > > > >
> > > > > > Maybe?
> > > > > >
> > > > > > Please look at this on a per-use-case basis.
> > > > >
> > > > > Sure. FWIW, I started a world-editable document here. I am
> > > > > planning to work on this a bit more before our meeting this week.
> > > > > If knowledgeable folks like Frederic could make edits/comments,
> > > > > that would be welcome. I have basically summarized this whole
> > > > > thread here:
> > > > > https://docs.google.com/document/d/1sFNuGH4U6DFCE8sunbdWMjFhJhb1aG4goOJAnS6c6gQ/edit?usp=sharing
> > > > >
> > > > > My thought is that, AFAICS, this patch is still valid and
> > > > > rcu_seq_done_exact() is potentially a better replacement for
> > > > > rcu_seq_done(). But famous last words...
> > > >
> > > > Let's start with the use case where the sequence number is being used
> > > > by rcu_barrier(). What happens?
> > >
> > > The rcu_state.barrier_sequence field is used to keep track of the
> > > barrier in progress; since there can be concurrent rcu_barrier()
> > > calls, we optimize by batching multiple such calls together.
> > >
> > > It also handles the case where the entrain IPI came through but no
> > > rcu_barrier() is currently in-flight (which would be weird but maybe
> > > it happens somehow).
> > >
> > > It is also used to make sure we don't entrain on a CPU again if we
> > > already entrained for a given rcu_barrier() invocation. This is
> > > tracked by rdp->barrier_seq_snap, which copies the value of
> > > rcu_state.barrier_sequence after the entraining. That said, I am
> > > currently unable to envision why the IPI handler would be called
> > > unnecessarily (hardware bug?), since we hold the rcu_barrier mutex
> > > throughout the rcu_barrier() function, including while waiting on
> > > CPUs.
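> > >
> > > (A simplified sketch of that dedup idea; the real
> > > rcu_barrier_entrain() in kernel/rcu/tree.c compares the sequence
> > > counter and state bits more carefully, so treat this as a sketch:)
> > >
> > > static void rcu_barrier_entrain(struct rcu_data *rdp)
> > > {
> > > 	unsigned long gseq = READ_ONCE(rcu_state.barrier_sequence);
> > >
> > > 	/* Already entrained for this rcu_barrier() invocation? */
> > > 	if (READ_ONCE(rdp->barrier_seq_snap) == gseq)
> > > 		return;
> > >
> > > 	/* ... queue the rcu_barrier callback on this CPU ... */
> > >
> > > 	smp_store_release(&rdp->barrier_seq_snap, gseq);
> > > }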
> > >
> > > Going back to my first point, for the rcu_seq_done() in rcu_barrier(),
> > > we are checking if someone else did our work:
> > > mutex_lock(&rcu_state.barrier_mutex);
> > > /* Did someone else do our work for us? */
> > > if (rcu_seq_done(&rcu_state.barrier_sequence, s))
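> > >
> > > (For context, the surrounding early-exit path in rcu_barrier() is
> > > roughly as follows; a lightly abridged sketch, not verbatim source:)
> > >
> > > 	s = rcu_seq_snap(&rcu_state.barrier_sequence);
> > >
> > > 	/* Serialize concurrent rcu_barrier() requests. */
> > > 	mutex_lock(&rcu_state.barrier_mutex);
> > >
> > > 	/* Did someone else do our work for us? */
> > > 	if (rcu_seq_done(&rcu_state.barrier_sequence, s)) {
> > > 		smp_mb(); /* Order subsequent code after observed barrier. */
> > > 		mutex_unlock(&rcu_state.barrier_mutex);
> > > 		return;
> > > 	}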
> > >
> > > Here I think rcu_seq_done_exact() is better: if too many concurrent
> > > rcu_barrier() calls are made at the same time, the mutex may be
> > > contended, and if enough time passes and we get a wraparound, we
> > > might end up in false-negative land, which is much larger with
> > > rcu_seq_done() than with ..done_exact(). However, it does seem to be
> > > a bug to begin with if the mutex is contended for that long. OTOH,
> > > doing an rcu_barrier() again when we didn't need to doesn't seem
> > > like the end of the world, especially if we contended on the mutex
> > > for that long. But at least it shows that rcu_seq_done_exact() does
> > > not seem to hurt here.
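> > >
> > > (To make the band sizes concrete, here is a userspace toy using the
> > > same wraparound comparisons; the macros mirror the kernel's
> > > ULONG_CMP_GE()/ULONG_CMP_LT(), and the 10-count band is assumed per
> > > the earlier discussion:)
> > >
> > > #include <limits.h>
> > > #include <stdio.h>
> > >
> > > #define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))
> > > #define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))
> > >
> > > int main(void)
> > > {
> > > 	unsigned long s = 28;	/* snapshot target */
> > >
> > > 	/* rcu_seq_done(): values up to ULONG_MAX/2 below s all read
> > > 	 * as "not done" -- the huge potential false-negative band. */
> > > 	printf("%d\n", ULONG_CMP_GE(28UL, s));		/* 1: done */
> > > 	printf("%d\n", ULONG_CMP_GE(18UL, s));		/* 0: not done */
> > >
> > > 	/* ..done_exact(): only [s-10, s-1] reads as "not done",
> > > 	 * i.e. the ~2.5-GP band. */
> > > 	printf("%d\n", ULONG_CMP_LT(17UL, s - 10));	/* 1: done */
> > > 	printf("%d\n", ULONG_CMP_LT(18UL, s - 10));	/* 0: not done */
> > > 	return 0;
> > > }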
> > >
> > > Hmmm,
> >
> > So... Do we really care about the difference between rcu_seq_done()
> > and rcu_seq_done_exact() in the rcu_barrier() case?
>
> Only insofar as getting rid of the rcu_seq_done() API in favor of
> rcu_seq_done_exact(); it should not cause any functional or robustness
> changes, to rcu_barrier() at least, AFAICS.
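>
> (If I follow, the merged helper would presumably end up looking
> something like this hypothetical sketch -- not the actual patch text:)
>
> static inline bool rcu_seq_done(unsigned long *sp, unsigned long s)
> {
> 	unsigned long cur_s = READ_ONCE(*sp);
>
> 	/* Adopt the tight ~10-count guard band instead of ULONG_MAX/2. */
> 	return ULONG_CMP_GE(cur_s, s) ||
> 	       ULONG_CMP_LT(cur_s, s - (3 * RCU_SEQ_STATE_MASK + 1));
> }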
OK, *that* I might be able to get behind. ;-)
Thanx, Paul