Message-ID: <20220927143020.GK4196@paulmck-ThinkPad-P17-Gen-1>
Date: Tue, 27 Sep 2022 07:30:20 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Joel Fernandes <joel@...lfernandes.org>
Cc: Uladzislau Rezki <urezki@...il.com>, rcu@...r.kernel.org,
linux-kernel@...r.kernel.org, rushikesh.s.kadam@...el.com,
neeraj.iitr10@...il.com, frederic@...nel.org, rostedt@...dmis.org
Subject: Re: [PATCH v6 1/4] rcu: Make call_rcu() lazy to save power
On Tue, Sep 27, 2022 at 02:22:56PM +0000, Joel Fernandes wrote:
> On Tue, Sep 27, 2022 at 07:14:03AM -0700, Paul E. McKenney wrote:
> > On Tue, Sep 27, 2022 at 01:05:41PM +0000, Joel Fernandes wrote:
> > > On Mon, Sep 26, 2022 at 08:22:46PM -0700, Paul E. McKenney wrote:
> > > [..]
> > > > > > > >>> --- a/kernel/workqueue.c
> > > > > > > >>> +++ b/kernel/workqueue.c
> > > > > > > >>> @@ -1771,7 +1771,7 @@ bool queue_rcu_work(struct workqueue_struct *wq, struct rcu_work *rwork)
> > > > > > > >>>
> > > > > > > >>> if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
> > > > > > > >>> rwork->wq = wq;
> > > > > > > >>> - call_rcu(&rwork->rcu, rcu_work_rcufn);
> > > > > > > >>> + call_rcu_flush(&rwork->rcu, rcu_work_rcufn);
> > > > > > > >>> return true;
> > > > > > > >>> }
> > > > > > > >>>
> > > > > > > >>> <snip>
> > > > > > > >>>
> > > > > > > >>> ?
> > > > > > > >>>
> > > > > > > >>> But it does not fully solve my boot-up issue. Will debug tomorrow further.
> > > > > > > >>
> > > > > > > >> Ah, but at least it's progress, thanks. Could you send me a patch to include
> > > > > > > >> in the next revision with details of this?
> > > > > > > >>
> > > > > > > >>>> Might one more proactive approach be to use Coccinelle to locate such
> > > > > > > >>>> callback functions? We might not want -all- callbacks that do wakeups
> > > > > > > >>>> to use call_rcu_flush(), but knowing which are which should speed up
> > > > > > > >>>> slow-boot debugging by quite a bit.
> > > > > > > >>>>
> > > > > > > >>>> Or is there a better way to do this?
> > > > > > > >>>>
> > > > > > > >>> I am not sure what Coccinelle is. If we had something automated that
> > > > > > > >>> measures boot time and, if needed, does some profiling, that would be
> > > > > > > >>> good. Otherwise it is mainly manual debugging, IMHO.
> > > > > > > >>
> > > > > > > >> Paul, what about using a default-off kernel CONFIG option that splats on all
> > > > > > > >> lazy call_rcu() callbacks that do a wake-up? We could use the trace hooks to
> > > > > > > >> do it in-kernel, I think. I can talk to Steve to get ideas on how to do that, but I
> > > > > > > >> think it can be done purely from trace events (we might need a new
> > > > > > > >> trace_end_invoke_callback to fire after the callback is invoked). Thoughts?
> > > > > > > >
> > > > > > > > Could you look for wakeups invoked between trace_rcu_batch_start() and
> > > > > > > > trace_rcu_batch_end() that are not from interrupt context? This would
> > > > > > > > of course need to be associated with a task rather than a CPU.
> > > > > > >
> > > > > > > Yes, this sounds good, but we also need to know if the callbacks are
> > > > > > > lazy or not, since a wake-up is OK from a non-lazy one. I think I’ll
> > > > > > > need a table to track that at queuing time.
> > > > > >
> > > > > > Agreed.
> > > > > >
> > > > > > > > Note that you would need to check for wakeups from interrupt handlers
> > > > > > > > even with the extra trace_end_invoke_callback(). The window where an
> > > > > > > > interrupt handler could do a wakeup would be reduced, but not eliminated.
> > > > > > >
> > > > > > > True! Since this is a debugging option, can we not just disable interrupts across callback invocation?
> > > > > >
> > > > > > Not without terminally annoying lockdep, at least for any RCU callbacks
> > > > > > doing things like spin_lock_bh().
> > > > > >
> > > > >
> > > > > Sorry if my last email bounced. Looks like my iPhone betrayed me this once ;)
> > > > >
> > > > > I was thinking something like this:
> > > > > 1. Put a flag in rcu_head to mark CBs as lazy.
> > > > > 2. Add a trace_rcu_invoke_callback_end() trace point.
> > > > >
> > > > > Both #1 and #2 can be behind a debug CONFIG option. #2 can be a tracepoint and not
> > > > > exposed if needed.
> > > > >
> > > > > 3. Put an in-kernel probe on both trace_rcu_invoke_callback_start() and
> > > > > trace_rcu_invoke_callback_end(). In the start probe, set a per-task flag if
> > > > > the current CB is lazy. In the end probe, clear it.
> > > > >
> > > > > 4. Put an in-kernel probe on trace_rcu_sched_wakeup().
> > > > >
> > > > > Splat in the wake up probe if:
> > > > > 1. Hard IRQs are on.
> > > > > 2. The per-task flag is set.
> > > > >
> > > > > #3 actually does not even need probes if we can directly call the functions
> > > > > from the rcu_do_batch() function.
> > > >
> > > > This is fine for an experiment or a debugging session, but a solution
> > > > based totally on instrumentation would be better for production use.
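
[ Purely for illustration, a minimal sketch of the per-task flag idea above.
  The Kconfig option, the task_struct field, and the helper names below are
  hypothetical, not existing kernel APIs: ]

#ifdef CONFIG_RCU_LAZY_DEBUG		/* hypothetical debug-only option */

/* Called directly from rcu_do_batch() around each callback invocation. */
static inline void rcu_lazy_debug_enter(bool lazy)
{
	current->rcu_lazy_cb_active = lazy;	/* new debug-only task_struct field */
}

static inline void rcu_lazy_debug_exit(void)
{
	current->rcu_lazy_cb_active = false;
}

/* Probe attached to trace_sched_wakeup(). */
static void rcu_lazy_debug_wakeup_probe(void *ignore, struct task_struct *p)
{
	/* Skip wakeups done by interrupt handlers that preempted the callback. */
	if (!in_hardirq() && current->rcu_lazy_cb_active)
		WARN_ONCE(1, "wakeup from a lazy call_rcu() callback\n");
}

#endif /* CONFIG_RCU_LAZY_DEBUG */

[ The probe could presumably be hooked up at boot with something like
  register_trace_sched_wakeup(rcu_lazy_debug_wakeup_probe, NULL). ]
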
> > >
> > > Maybe we can borrow the least-significant bit of rhp->func to mark laziness?
> > > Then it can be used in production as long as we're OK with the trace_sched_wakeup
> > > probe.
> >
> > Last time I tried this, there were architectures that could have odd-valued
> > function addresses. Maybe this is no longer the case?
>
> Oh ok! If this happens, maybe we can just make it depend on x86-64 assuming
> x86-64 does not have pointer oddness. We can also add a warning if the
> function address is odd before setting the bit.
Let me rephrase this... ;-)
Given that this used to not work and still might not work, let's see
if we can find some other way to debug this, unless and until it can
be demonstrated that there is no supported compiler that will generate
odd-valued function addresses on any supported architecture.
Plus there was a time that x86 did odd-valued pointer addresses.
The instruction set is plenty fine with this, so it would have to be a
compiler and assembly-language convention to avoid it.
Thanx, Paul
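
[ For reference, a purely illustrative sketch of the LSB-tagging idea
  discussed above. The helper names are hypothetical, and the whole scheme
  of course depends on function addresses being guaranteed even, per the
  caveat about odd-valued function addresses: ]

#define RCU_CB_LAZY_BIT		0x1UL

static inline rcu_callback_t rcu_cb_mark_lazy(rcu_callback_t func)
{
	/* Cheap check for the odd-address case discussed above. */
	if (WARN_ON_ONCE((unsigned long)func & RCU_CB_LAZY_BIT))
		return func;			/* leave it untagged */
	return (rcu_callback_t)((unsigned long)func | RCU_CB_LAZY_BIT);
}

static inline bool rcu_cb_is_lazy(rcu_callback_t func)
{
	return (unsigned long)func & RCU_CB_LAZY_BIT;
}

static inline rcu_callback_t rcu_cb_unmark(rcu_callback_t func)
{
	return (rcu_callback_t)((unsigned long)func & ~RCU_CB_LAZY_BIT);
}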