Message-ID: <CAEXW_YRQiuvsy1FsMNWG7wd9ah_gfgcOUAeNzA-QbmDcACa+Uw@mail.gmail.com>
Date: Tue, 28 Jun 2022 17:13:21 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Uladzislau Rezki <urezki@...il.com>, rcu <rcu@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Rushikesh S Kadam <rushikesh.s.kadam@...el.com>,
Neeraj upadhyay <neeraj.iitr10@...il.com>,
Frederic Weisbecker <frederic@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>, vineeth@...byteword.org
Subject: Re: [PATCH v2 8/8] rcu/kfree: Fix kfree_rcu_shrink_count() return value
On Tue, Jun 28, 2022 at 12:56 PM Joel Fernandes <joel@...lfernandes.org> wrote:
>
> On Mon, Jun 27, 2022 at 02:43:59PM -0700, Paul E. McKenney wrote:
> > On Mon, Jun 27, 2022 at 09:18:13PM +0000, Joel Fernandes wrote:
> > > On Mon, Jun 27, 2022 at 01:59:07PM -0700, Paul E. McKenney wrote:
> > > > On Mon, Jun 27, 2022 at 08:56:43PM +0200, Uladzislau Rezki wrote:
> > > > > > As per the comments in include/linux/shrinker.h, the .count_objects callback
> > > > > > should return the number of freeable items, but if there are no objects
> > > > > > to free, SHRINK_EMPTY should be returned. The only time 0 should be
> > > > > > returned is when we are unable to determine the number of objects, or
> > > > > > when the cache should be skipped for another reason.
> > > > > >
> > > > > > Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> > > > > > ---
> > > > > > kernel/rcu/tree.c | 2 +-
> > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > index 711679d10cbb..935788e8d2d7 100644
> > > > > > --- a/kernel/rcu/tree.c
> > > > > > +++ b/kernel/rcu/tree.c
> > > > > > @@ -3722,7 +3722,7 @@ kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> > > > > >  		atomic_set(&krcp->backoff_page_cache_fill, 1);
> > > > > >  	}
> > > > > >
> > > > > > -	return count;
> > > > > > +	return count == 0 ? SHRINK_EMPTY : count;
> > > > > >  }
> > > > > >
> > > > > >  static unsigned long
> > > > > > --
> > > > > > 2.37.0.rc0.104.g0611611a94-goog
> > > > > >
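For anyone skimming the thread, the contract from include/linux/shrinker.h
that the changelog describes boils down to something like the following
minimal sketch (hypothetical foo_* names, not code from this patch):

static unsigned long
foo_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
{
	unsigned long count = atomic_long_read(&foo_nr_freeable);

	if (!count)
		return SHRINK_EMPTY;	/* Provably nothing to free. */

	/* Nonzero means freeable objects; 0 is reserved for "cannot tell". */
	return count;
}

That is exactly the convention the one-line change above adopts for
kfree_rcu_shrink_count().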
> > > > > Looks good to me!
> > > > >
> > > > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@...il.com>
> > > >
> > > > Now that you mention it, this does look independent of the rest of
> > > > the series. I have pulled it in with Uladzislau's Reviewed-by.
> > >
> > > Thanks Paul and Vlad!
> > >
> > > Paul, apologies for being quiet. I have been working on the series and the
> > > review comments carefully. I appreciate your help with this work.
> >
> > Not a problem. After all, this stuff is changing some of the trickier
> > parts of RCU. We must therefore assume that some significant time and
> > effort will be required to get it right.
>
> To your point about the trickier parts of RCU, the v2 series, though I
> tested it before submitting, is now giving me strange results with
> rcuscale. Sometimes laziness does not seem to be in effect (as pointed
> out by rcuscale); other times I am seeing stalls.
>
> So I have to carefully look through all of this again. I am not sure why I
> was not seeing these issues with the exact same code before (frustrated).
Looks like I found at least 3 bugs in my v2 series, which testing has
now picked up: RCU-lazy was either being too lazy or not lazy enough.
Now the tests pass, so it's progress, but it does beg for more testing.
On top of the v2 series:
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index c06a96b6a18a..7021ee05155d 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -292,7 +292,8 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
 	 */
 	switch (waketype) {
 	case RCU_NOCB_WAKE_LAZY:
-		mod_jif = jiffies_till_flush;
+		if (rdp->nocb_defer_wakeup != RCU_NOCB_WAKE_LAZY)
+			mod_jif = jiffies_till_flush;
 		break;

 	case RCU_NOCB_WAKE_BYPASS:
@@ -714,13 +715,13 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
 		bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
 		lazy_ncbs = rcu_cblist_n_lazy_cbs(&rdp->nocb_bypass);
 		if (lazy_ncbs &&
-		    (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + LAZY_FLUSH_JIFFIES) ||
+		    (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + jiffies_till_flush) ||
 		     bypass_ncbs > qhimark)) {
 			// Bypass full or old, so flush it.
 			(void)rcu_nocb_try_flush_bypass(rdp, j);
 			bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
 			lazy_ncbs = rcu_cblist_n_lazy_cbs(&rdp->nocb_bypass);
-		} else if (bypass_ncbs &&
+		} else if (bypass_ncbs && (lazy_ncbs != bypass_ncbs) &&
 			   (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) ||
 			    bypass_ncbs > 2 * qhimark)) {
 			// Bypass full or old, so flush it.
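To spell out the two hunks: the first stops a new lazy callback from
pushing out an already-armed lazy deferred wakeup. Condensed to a sketch
(this mirrors the hunk above rather than adding anything new):

	case RCU_NOCB_WAKE_LAZY:
		/* Keep the deadline of an already-pending lazy wakeup. */
		if (rdp->nocb_defer_wakeup != RCU_NOCB_WAKE_LAZY)
			mod_jif = jiffies_till_flush;
		break;

Without that check, a steady trickle of lazy callbacks keeps re-arming
the timer, so the flush can be deferred indefinitely (the "too lazy"
failure mode). The second hunk makes nocb_gp_wait() compare against
jiffies_till_flush (the same value the first hunk uses) instead of the
hard-coded LAZY_FLUSH_JIFFIES, and skips the one-jiffy non-lazy flush
path when every bypass callback is lazy (lazy_ncbs == bypass_ncbs),
which was the "not lazy enough" failure mode.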