Date:   Thu, 30 Jun 2022 14:25:16 +0000
From:   Joel Fernandes <joel@...lfernandes.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Uladzislau Rezki <urezki@...il.com>, rcu <rcu@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Rushikesh S Kadam <rushikesh.s.kadam@...el.com>,
        Neeraj upadhyay <neeraj.iitr10@...il.com>,
        Frederic Weisbecker <frederic@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>, vineeth@...byteword.org
Subject: Re: [PATCH v2 8/8] rcu/kfree: Fix kfree_rcu_shrink_count() return value

On Wed, Jun 29, 2022 at 02:07:20PM -0700, Paul E. McKenney wrote:
> On Wed, Jun 29, 2022 at 07:47:36PM +0000, Joel Fernandes wrote:
> > On Wed, Jun 29, 2022 at 09:56:27AM -0700, Paul E. McKenney wrote:
> > > On Tue, Jun 28, 2022 at 05:13:21PM -0400, Joel Fernandes wrote:
> > > > On Tue, Jun 28, 2022 at 12:56 PM Joel Fernandes <joel@...lfernandes.org> wrote:
> > > > >
> > > > > On Mon, Jun 27, 2022 at 02:43:59PM -0700, Paul E. McKenney wrote:
> > > > > > On Mon, Jun 27, 2022 at 09:18:13PM +0000, Joel Fernandes wrote:
> > > > > > > On Mon, Jun 27, 2022 at 01:59:07PM -0700, Paul E. McKenney wrote:
> > > > > > > > On Mon, Jun 27, 2022 at 08:56:43PM +0200, Uladzislau Rezki wrote:
> > > > > > > > > > As per the comments in include/linux/shrinker.h, .count_objects callback
> > > > > > > > > > should return the number of freeable items, but if there are no objects
> > > > > > > > > > to free, SHRINK_EMPTY should be returned. The only time 0 is returned
> > > > > > > > > > should be when we are unable to determine the number of objects, or the
> > > > > > > > > > cache should be skipped for another reason.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> > > > > > > > > > ---
> > > > > > > > > >  kernel/rcu/tree.c | 2 +-
> > > > > > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > > > > > index 711679d10cbb..935788e8d2d7 100644
> > > > > > > > > > --- a/kernel/rcu/tree.c
> > > > > > > > > > +++ b/kernel/rcu/tree.c
> > > > > > > > > > @@ -3722,7 +3722,7 @@ kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> > > > > > > > > >               atomic_set(&krcp->backoff_page_cache_fill, 1);
> > > > > > > > > >       }
> > > > > > > > > >
> > > > > > > > > > -     return count;
> > > > > > > > > > +     return count == 0 ? SHRINK_EMPTY : count;
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > >  static unsigned long
> > > > > > > > > > --
> > > > > > > > > > 2.37.0.rc0.104.g0611611a94-goog
> > > > > > > > > >
> > > > > > > > > Looks good to me!
> > > > > > > > >
> > > > > > > > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@...il.com>
> > > > > > > >
> > > > > > > > Now that you mention it, this does look independent of the rest of
> > > > > > > > the series.  I have pulled it in with Uladzislau's Reviewed-by.
> > > > > > >
> > > > > > > Thanks Paul and Vlad!
> > > > > > >
> > > > > > > Paul, apologies for being quiet. I have been working on the series and the
> > > > > > > review comments carefully. I appreciate your help with this work.
> > > > > >
> > > > > > Not a problem.  After all, this stuff is changing some of the trickier
> > > > > > parts of RCU.  We must therefore assume that some significant time and
> > > > > > effort will be required to get it right.
> > > > >
> > > > > To your point about trickier parts of RCU, the v2 series, though I tested
> > > > > it before submitting, is now giving me strange results with rcuscale.
> > > > > Sometimes laziness does not seem to be in effect (as pointed out by
> > > > > rcuscale), other times I am seeing stalls.
> > > > >
> > > > > So I have to carefully look through all of this again. I am not sure why I
> > > > > was not seeing these issues with the exact same code before (frustrated).
> > > > 
> > > > Looks like I found at least 3 bugs in my v2 series which testing
> > > > picked up now. RCU-lazy was being either too lazy or not lazy enough.
> > > > Now tests pass, so it's progress, but it does beg for more testing:
> > > 
> > > It is entirely possible that call_rcu_lazy() needs its own special
> > > purpose tests.  This might be a separate test parallel to the test for
> > > kfree_rcu() in kernel/rcu/rcuscale.c, for example.
> > 
> > I see, perhaps I can add a 'lazy' flag to rcutorture as well, so it uses
> > call_rcu_lazy() for its async RCU invocations?
> 
> That will be tricky because of rcutorture's timeliness expectations.

I now have a facility to set the lazy timeout from test kernel modules. I was
thinking I could set it from rcutorture as well, maybe to something like 100
jiffies? Then it can run through all the regular rcutorture tests and
still exercise the new code paths.

> Maybe a self-invoking lazy callback initiated by rcu_torture_fakewriter()
> that prints a line about its statistics at shutdown time?  At a minimum,
> the number of times that it was invoked.  Better would be to print one
> line summarizing stats for all of them.
> 
> The main thing that could be detected from this is a callback being
> stranded.  Given that rcutorture enqueues non-lazy callbacks like a
> drunken sailor, they won't end up being all that lazy.

Thanks for this idea as well. I'll think more about it.

Thanks,

 - Joel
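
For reference, the return-value convention described in the patch's commit
message can be modelled in user space roughly as below. This is only a sketch:
the SHRINK_EMPTY and SHRINK_STOP values are redefined here to mirror
include/linux/shrinker.h so that the code builds outside the kernel, and
shrink_count_result() and freeable_objects() are hypothetical helper names
that do not exist in the kernel tree.

```c
#include <assert.h>

/* Redefined for user space; intended to mirror include/linux/shrinker.h. */
#define SHRINK_STOP  (~0UL)
#define SHRINK_EMPTY (~0UL - 1)

/*
 * Hypothetical helper: map a raw object count to the value that a
 * .count_objects callback should return. A count of zero means "nothing
 * to free", which is reported as SHRINK_EMPTY rather than 0.
 */
static unsigned long shrink_count_result(unsigned long count)
{
	return count == 0 ? SHRINK_EMPTY : count;
}

/*
 * Hypothetical caller-side model: how many objects a caller would
 * consider scanning for a given .count_objects return value.
 */
static unsigned long freeable_objects(unsigned long ret)
{
	/* SHRINK_STOP is a .scan_objects return value, not a count. */
	assert(ret != SHRINK_STOP);

	/* 0 means "could not determine / skip"; SHRINK_EMPTY means "none". */
	if (ret == 0 || ret == SHRINK_EMPTY)
		return 0;
	return ret;
}
```

With this model, freeable_objects(shrink_count_result(0)) is 0 and
freeable_objects(shrink_count_result(7)) is 7. The difference from returning
a plain 0 is that the caller can tell "known to be empty" apart from "count
unknown, skip this cache".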
