linux-kernel - Re: [PATCH] rcu: Make jiffies_till_sched

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190715200901.GW26519@linux.ibm.com>
Date:   Mon, 15 Jul 2019 13:09:01 -0700
From:   "Paul E. McKenney" <paulmck@...ux.ibm.com>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Byungchul Park <max.byungchul.park@...il.com>,
        Byungchul Park <byungchul.park@....com>,
        Josh Triplett <josh@...htriplett.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        rcu <rcu@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
        kernel-team@....com
Subject: Re: [PATCH] rcu: Make jiffies_till_sched_qs writable

On Mon, Jul 15, 2019 at 01:39:24PM -0400, Joel Fernandes wrote:
> On Sun, Jul 14, 2019 at 06:56:10AM -0700, Paul E. McKenney wrote:
> [snip]
> > > > > > > > > >
> > > > > > > > > >       commit ae91aa0adb14dc33114d566feca2f7cb7a96b8b7
> > > > > > > > > >       rcu: Remove debugfs tracing
> > > > > > > > > >
> > > > > > > > > > removed all debugfs tracing, gp_max also included.
> > > > > > > > > >
> > > > > > > > > > And you sounds great. And even looks not that hard to add it like,
> > > > > > > > > >
> > > > > > > > > > :)
> > > > > > > > > >
> > > > > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > > > > > index ad9dc86..86095ff 100644
> > > > > > > > > > --- a/kernel/rcu/tree.c
> > > > > > > > > > +++ b/kernel/rcu/tree.c
> > > > > > > > > > @@ -1658,8 +1658,10 @@ static void rcu_gp_cleanup(void)
> > > > > > > > > >       raw_spin_lock_irq_rcu_node(rnp);
> > > > > > > > > >       rcu_state.gp_end = jiffies;
> > > > > > > > > >       gp_duration = rcu_state.gp_end - rcu_state.gp_start;
> > > > > > > > > > -     if (gp_duration > rcu_state.gp_max)
> > > > > > > > > > +     if (gp_duration > rcu_state.gp_max) {
> > > > > > > > > >               rcu_state.gp_max = gp_duration;
> > > > > > > > > > +             trace_rcu_grace_period(something something);
> > > > > > > > > > +     }
> > > > > > > > >
> > > > > > > > > Yes, that makes sense. But I think it is much better off as a readable value
> > > > > > > > > from a virtual fs. The drawback of tracing for this sort of thing are:
> > > > > > > > >  - Tracing will only catch it if tracing is on
> > > > > > > > >  - Tracing data can be lost if too many events, then no one has a clue what
> > > > > > > > >    the max gp time is.
> > > > > > > > >  - The data is already available in rcu_state::gp_max so copying it into the
> > > > > > > > >    trace buffer seems a bit pointless IMHO
> > > > > > > > >  - It is a lot easier on ones eyes to process a single counter than process
> > > > > > > > >    heaps of traces.
> > > > > > > > >
> > > > > > > > > I think a minimal set of RCU counters exposed to /proc or /sys should not
> > > > > > > > > hurt and could do more good than not. The scheduler already does this for
> > > > > > > > > scheduler statistics. I have seen Peter complain a lot about new tracepoints
> > > > > > > > > but not much (or never) about new statistics.
> > > > > > > > >
> > > > > > > > > Tracing has its strengths but may not apply well here IMO. I think a counter
> > > > > > > > > like this could be useful for tuning of things like the jiffies_*_sched_qs,
> > > > > > > > > the stall timeouts and also any other RCU knobs. What do you think?
> > > > > > > >
> > > > > > > > I prefer proc/sys knob for it to tracepoint. Why I've considered it is just it
> > > > > > > > looks like undoing what Paul did at ae91aa0ad.
> > > > > > > >
> > > > > > > > I think you're rational enough. I just wondered how Paul think of it.
> > > > > > >
> > > > > > > I believe at least initially, a set of statistics can be made
> > > > > > > available only when rcutorture or rcuperf module is loaded. That way
> > > > > > > they are purely only for debugging and nothing needs to be exposed to
> > > > > > > normal kernels distributed thus reducing testability concerns.
> > > > > > >
> > > > > > > rcu_state::gp_max would be trivial to expose through this, but for
> > > > > > > other statistics that are more complicated - perhaps
> > > > > > > tracepoint_probe_register can be used to add hooks on to the
> > > > > > > tracepoints and generate statistics from them. Again the registration
> > > > > > > of the probe and the probe handler itself would all be in
> > > > > > > rcutorture/rcuperf test code and not a part of the kernel proper.
> > > > > > > Thoughts?
> > > > > >
> > > > > > It still feels like you guys are hyperfocusing on this one particular
> > > > > > knob.  I instead need you to look at the interrelating knobs as a group.
> > > > >
> > > > > Thanks for the hints, we'll do that.
> > > > >
> > > > > > On the debugging side, suppose someone gives you an RCU bug report.
> > > > > > What information will you need?  How can you best get that information
> > > > > > without excessive numbers of over-and-back interactions with the guy
> > > > > > reporting the bug?  As part of this last question, what information is
> > > > > > normally supplied with the bug?  Alternatively, what information are
> > > > > > bug reporters normally expected to provide when asked?
> > > > >
> > > > > I suppose I could dig out some of our Android bug reports of the past where
> > > > > there were RCU issues but if there's any fires you are currently fighting do
> > > > > send it our way as debugging homework ;-)
> > > >
> > > > Evading the questions, are we?
> 
> Sorry if I sounded like evading, I was just saying that it would be nice to
> work on some specific issue or bug report and then figure out what is needed
> from there, instead of trying to predict what data is useful. Of course, I
> think predicting and dumping data is also useful, but I was just wondering if
> you had some specific issue in mind that we could work off of (you did
> mention the CPU stopper below so thank you!)

No worries, and fair enough!

> > > > OK, I can be flexible.  Suppose that you were getting RCU CPU stall
> > > > warnings featuring multi_cpu_stop() called from cpu_stopper_thread().
> > > > Of course, this really means that some other CPU/task is holding up
> > > > multi_cpu_stop() without also blocking the current grace period.
> > > >
> > > > What is the best way to work out what is really holding things up?
> > > 
> > > Either the stopper preempted another being in a critical section and
> > > has something wrong itself in case of PREEMPT or mechanisms for
> > > urgent control doesn't work correctly.
> > > 
> > > I don't know what exactly you intended but I would check things like
> > > (1) irq disable / eqs / tick / scheduler events and (2) whether special
> > > handling for each level of qs urgency has started correctly. For that
> > > purpose all the history of those events would be more useful.
> 
> Agreed, these are all good.

Just to reiterate -- it would be good if multi_cpu_stop() could
automatically dump out useful information when it has been delayed too
long, where "too long" is probably a bit shorter than the RCU CPU stall
warning timeout.  Useful information would include the stack of the task
impeding multi_cpu_stop().

> > > And with thinking it more, we could come up with a good way to
> > > make use of those data to identify what the problem is. Do I catch
> > > the point correctly? If so, me and Joel can start to work on it.
> > > Otherwise, please correct me.
> > 
> > I believe you are on the right track.  In short, it would be great if
> > the kernel would automatically dump out the needed information when
> > cpu_stopper gets stalled, sort of like RCU does (much of the time,
> > anyway) in its CPU stall warnings.  Given a patch that did this, I would
> > be quite happy to help advocate for it!
> 
> In case you have a LKML link to a thread or bug report to this specific
> cpu_stopper issue, please do pass it along.

I hope to have an rcutorture-based repeat-by in a few days, give or take.
(Famous last words!)

> Happy to work on this with Byungchul, thanks,

Thank you both!

							Thanx, Paul