Message-ID: <1408fc88-e2c6-4f49-b581-0e9ad5620fe0@paulmck-laptop>
Date: Fri, 28 Feb 2025 07:41:40 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Uladzislau Rezki <urezki@...il.com>
Cc: Boqun Feng <boqun.feng@...il.com>, RCU <rcu@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Frederic Weisbecker <frederic@...nel.org>,
Cheung Wall <zzqq0103.hey@...il.com>,
Neeraj upadhyay <Neeraj.Upadhyay@....com>,
Joel Fernandes <joel@...lfernandes.org>,
Oleksiy Avramchenko <oleksiy.avramchenko@...y.com>
Subject: Re: [PATCH v4 3/3] rcu: Use _full() API to debug synchronize_rcu()
On Thu, Feb 27, 2025 at 06:44:15PM +0100, Uladzislau Rezki wrote:
> On Thu, Feb 27, 2025 at 09:26:40AM -0800, Paul E. McKenney wrote:
> > On Thu, Feb 27, 2025 at 09:12:39AM -0800, Boqun Feng wrote:
> > > Hi Ulad,
> > >
> > > I put these three patches into next (and misc.2025.02.27a) for some
> > > testing. Hopefully it all goes well and they can make it into v6.15.
> > >
> > > A few tags changed below:
> > >
> > > On Thu, Feb 27, 2025 at 02:16:13PM +0100, Uladzislau Rezki (Sony) wrote:
> > > > Switch to using the get_state_synchronize_rcu_full() and
> > > > poll_state_synchronize_rcu_full() pair to debug a normal
> > > > synchronize_rcu() call.
> > > >
> > > > Using only the non-full APIs to determine whether a grace period
> > > > has elapsed might lead to a false-positive kernel splat.
> > > >
> > > > It can happen, because get_state_synchronize_rcu() compresses
> > > > both normal and expedited states into one single unsigned long
> > > > value, so a poll_state_synchronize_rcu() can miss GP-completion
> > > > when synchronize_rcu()/synchronize_rcu_expedited() concurrently
> > > > run.
> > > >
> > > > To address this, switch to poll_state_synchronize_rcu_full() and
> > > > get_state_synchronize_rcu_full() APIs, which use separate variables
> > > > for expedited and normal states.
> > > >
> > > > Link: https://lore.kernel.org/lkml/Z5ikQeVmVdsWQrdD@pc636/T/
> > >
> > > I switched this to "Closes:" per checkpatch.
> > >
> > > > Fixes: 988f569ae041 ("rcu: Reduce synchronize_rcu() latency")
> > > > Reported-by: cheung wall <zzqq0103.hey@...il.com>
> > > > Signed-off-by: Uladzislau Rezki (Sony) <urezki@...il.com>
> > >
> > > You seem to have forgotten to add Paul's Reviewed-by, so I added it
> > > in rcu/next. Would you or Paul double-check that the Reviewed-by
> > > should be here?
> >
> > I am good with keeping my Reviewed-by tags.
> >
> Thanks Paul!
Except that I got this from overnight testing of rcu/dev on the shared
RCU tree:

WARNING: CPU: 5 PID: 14 at kernel/rcu/tree.c:1636 rcu_sr_normal_complete+0x5c/0x80

I see this only on TREE05, which should not be too surprising given
that this is the scenario that tests it. It happened within five minutes
on all 14 of the TREE05 runs.

This commit, just to avoid any ambiguity:

7cb48b64a563 ("MAINTAINERS: Update Joel's email address")

Thanx, Paul
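
For reference, the false positive described in the quoted changelog can be
sketched as a user-space toy model. This is only a sketch under stated
assumptions: the counter helpers are loosely modeled on the kernel's
rcu_seq_*() style of sequence counter, but struct toy_rcu, struct
toy_cookie_full, run_scenario(), and the particular interleaving are
hypothetical names and choices made for illustration, and counter wrap and
locking are ignored. It is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: names and layout are illustrative, not the kernel's. */
#define SEQ_STATE_MASK 3UL	/* low bits are nonzero while a GP is in progress */

/* Counter value reached once a full GP that starts after "now" has ended. */
static unsigned long seq_snap(unsigned long seq)
{
	return (seq + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

static void seq_start(unsigned long *sp)
{
	assert(!(*sp & SEQ_STATE_MASK));	/* must be idle */
	*sp += 1;
}

static void seq_end(unsigned long *sp)
{
	assert(*sp & SEQ_STATE_MASK);		/* must be in progress */
	*sp = (*sp | SEQ_STATE_MASK) + 1;
}

/* Wrap-around is ignored in this toy model. */
static bool seq_done(unsigned long seq, unsigned long cookie)
{
	return seq >= cookie;
}

struct toy_rcu {
	unsigned long norm_seq;		/* normal grace periods only */
	unsigned long exp_seq;		/* expedited grace periods only */
	unsigned long polled_seq;	/* single counter shared by both kinds */
};

struct toy_cookie_full {
	unsigned long norm;
	unsigned long exp;
};

/* Either GP kind starts the shared counter only if it is currently idle. */
static unsigned long polled_gp_start(struct toy_rcu *r)
{
	if (!(r->polled_seq & SEQ_STATE_MASK))
		seq_start(&r->polled_seq);
	return r->polled_seq;
}

/* The shared counter advances only if it is unchanged since the snapshot. */
static void polled_gp_end(struct toy_rcu *r, unsigned long snap)
{
	if (snap && snap == r->polled_seq)
		seq_end(&r->polled_seq);
}

void run_scenario(bool *old_done, bool *full_done)
{
	struct toy_rcu r = { 0, 0, 0 };
	unsigned long exp_snap, norm_snap, old_cookie;
	struct toy_cookie_full full;

	/* 1. An expedited GP starts. */
	seq_start(&r.exp_seq);
	exp_snap = polled_gp_start(&r);

	/* 2. A synchronize_rcu() caller records its cookies. */
	old_cookie = seq_snap(r.polled_seq);
	full.norm = seq_snap(r.norm_seq);
	full.exp = seq_snap(r.exp_seq);

	/* 3. A normal GP starts after the caller recorded its cookies. */
	seq_start(&r.norm_seq);
	norm_snap = polled_gp_start(&r);

	/* 4. The expedited GP ends and advances the shared counter. */
	seq_end(&r.exp_seq);
	polled_gp_end(&r, exp_snap);

	/* 5. The normal GP ends; the shared counter has moved, so no update. */
	seq_end(&r.norm_seq);
	polled_gp_end(&r, norm_snap);

	/* A full normal GP has elapsed for the caller. Who notices? */
	*old_done = seq_done(r.polled_seq, old_cookie);
	*full_done = seq_done(r.norm_seq, full.norm) ||
		     seq_done(r.exp_seq, full.exp);
}
```

In this interleaving the single shared counter stops short of the caller's
cookie even though a complete normal grace period has elapsed since the
cookie was taken, so a check against the combined cookie reports "not done".
The two-field cookie tracks the normal and expedited counters separately and
reports completion.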