Message-ID: <Ysh6yWThHu6GAfJM@google.com>
Date: Fri, 8 Jul 2022 18:43:21 +0000
From: Joel Fernandes <joel@...lfernandes.org>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: rcu@...r.kernel.org, linux-kernel@...r.kernel.org,
rushikesh.s.kadam@...el.com, urezki@...il.com,
neeraj.iitr10@...il.com, frederic@...nel.org, rostedt@...dmis.org,
vineeth@...byteword.org
Subject: Re: [PATCH v2 1/8] rcu: Introduce call_rcu_lazy() API implementation
On Sat, Jun 25, 2022 at 09:00:19PM -0700, Paul E. McKenney wrote:
> On Wed, Jun 22, 2022 at 10:50:55PM +0000, Joel Fernandes (Google) wrote:
> > Implement timer-based RCU lazy callback batching. The batch is flushed
> > whenever a certain amount of time has passed, or the batch on a
> > particular CPU grows too big. A future patch will also flush it under
> > memory pressure.
> >
> > To handle several corner cases automagically (such as rcu_barrier() and
> > hotplug), we re-use bypass lists to handle lazy CBs. The bypass list
> > length has the lazy CB length included in it. A separate lazy CB length
> > counter is also introduced to keep track of the number of lazy CBs.
> >
> > Suggested-by: Paul McKenney <paulmck@...nel.org>
> > Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
>
> Not bad, but some questions and comments below.
Thanks a lot for these, they were really helpful. I replied below:
> > diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
> > index 659d13a7ddaa..9a992707917b 100644
> > --- a/include/linux/rcu_segcblist.h
> > +++ b/include/linux/rcu_segcblist.h
> > @@ -22,6 +22,7 @@ struct rcu_cblist {
> > struct rcu_head *head;
> > struct rcu_head **tail;
> > long len;
> > + long lazy_len;
> > };
> >
> > #define RCU_CBLIST_INITIALIZER(n) { .head = NULL, .tail = &n.head }
> > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > index 1a32036c918c..9191a3d88087 100644
> > --- a/include/linux/rcupdate.h
> > +++ b/include/linux/rcupdate.h
> > @@ -82,6 +82,12 @@ static inline int rcu_preempt_depth(void)
> >
> > #endif /* #else #ifdef CONFIG_PREEMPT_RCU */
> >
> > +#ifdef CONFIG_RCU_LAZY
> > +void call_rcu_lazy(struct rcu_head *head, rcu_callback_t func);
> > +#else
> > +#define call_rcu_lazy(head, func) call_rcu(head, func)
> > +#endif
> > +
> > /* Internal to kernel */
> > void rcu_init(void);
> > extern int rcu_scheduler_active;
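(Side note for anyone following along: call_rcu_lazy() takes the same arguments
as call_rcu() and is used the same way; below is a quick usage sketch, with
made-up names such as struct foo and foo_reclaim():)

	struct foo {
		int data;
		struct rcu_head rcu;
	};

	static void foo_reclaim(struct rcu_head *rhp)
	{
		struct foo *fp = container_of(rhp, struct foo, rcu);

		kfree(fp);
	}

	static void foo_release(struct foo *fp)
	{
		/* The callback may be batched and invoked after a delay. */
		call_rcu_lazy(&fp->rcu, foo_reclaim);
	}
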
> > diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> > index 27aab870ae4c..0bffa992fdc4 100644
> > --- a/kernel/rcu/Kconfig
> > +++ b/kernel/rcu/Kconfig
> > @@ -293,4 +293,12 @@ config TASKS_TRACE_RCU_READ_MB
> > Say N here if you hate read-side memory barriers.
> > Take the default if you are unsure.
> >
> > +config RCU_LAZY
> > + bool "RCU callback lazy invocation functionality"
> > + depends on RCU_NOCB_CPU
> > + default n
> > + help
> > + To save power, batch RCU callbacks and flush after delay, memory
> > + pressure or callback list growing too big.
>
> Spaces vs. tabs.
Fixed, thanks.
> The checkpatch warning is unhelpful ("please write a help paragraph that
> fully describes the config symbol")
Good old checkpatch :D
> > endmenu # "RCU Subsystem"
> > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > index c54ea2b6a36b..627a3218a372 100644
> > --- a/kernel/rcu/rcu_segcblist.c
> > +++ b/kernel/rcu/rcu_segcblist.c
> > @@ -20,6 +20,7 @@ void rcu_cblist_init(struct rcu_cblist *rclp)
> > rclp->head = NULL;
> > rclp->tail = &rclp->head;
> > rclp->len = 0;
> > + rclp->lazy_len = 0;
> > }
> >
> > /*
> > @@ -32,6 +33,15 @@ void rcu_cblist_enqueue(struct rcu_cblist *rclp, struct rcu_head *rhp)
> > WRITE_ONCE(rclp->len, rclp->len + 1);
> > }
> >
> > +/*
> > + * Enqueue an rcu_head structure onto the specified callback list.
>
> Please also note the fact that it is enqueuing lazily.
Sorry, done.
> > + */
> > +void rcu_cblist_enqueue_lazy(struct rcu_cblist *rclp, struct rcu_head *rhp)
> > +{
> > + rcu_cblist_enqueue(rclp, rhp);
> > + WRITE_ONCE(rclp->lazy_len, rclp->lazy_len + 1);
>
> Except... Why not just add a "lazy" parameter to rcu_cblist_enqueue()?
> IS_ENABLED() can make it fast.
Yeah good idea, it simplifies the code too. Thank you!
So you mean I should add something like this in the function, so that the
branch gets optimized out when the config option is off:
if (lazy && IS_ENABLED(CONFIG_RCU_LAZY)) {
...
}
That makes total sense, considering the compiler may otherwise not be able to
optimize the function when looking at just the individual translation unit. I
fixed it.
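In the respin, the combined function would then look roughly like this (just a
sketch; the exact signature and placement may still change):

	void rcu_cblist_enqueue(struct rcu_cblist *rclp, struct rcu_head *rhp, bool lazy)
	{
		*rclp->tail = rhp;
		rclp->tail = &rhp->next;
		WRITE_ONCE(rclp->len, rclp->len + 1);

		/* The branch compiles away entirely when CONFIG_RCU_LAZY=n. */
		if (lazy && IS_ENABLED(CONFIG_RCU_LAZY))
			WRITE_ONCE(rclp->lazy_len, rclp->lazy_len + 1);
	}
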
The 6-month-old baby and my wife are calling for my attention now. I will
continue replying to the other parts of this and the other emails this
evening. Thanks for your help!
thanks,
- Joel