[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20190628202459.GD26519@linux.ibm.com>
Date: Fri, 28 Jun 2019 13:24:59 -0700
From: "Paul E. McKenney" <paulmck@...ux.ibm.com>
To: Scott Wood <swood@...hat.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Juri Lelli <juri.lelli@...hat.com>,
Clark Williams <williams@...hat.com>,
linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH RT 4/4] rcutorture: Avoid problematic critical
section nesting
On Fri, Jun 28, 2019 at 02:37:24PM -0500, Scott Wood wrote:
> On Thu, 2019-06-27 at 17:52 -0700, Paul E. McKenney wrote:
> > On Thu, Jun 27, 2019 at 05:46:27PM -0500, Scott Wood wrote:
> > > On Thu, 2019-06-27 at 13:50 -0700, Paul E. McKenney wrote:
> > > > If by IPI-to-self you mean the IRQ work trick, that isn't implemented
> > > > across all architectures yet, is it?
> > >
> > > Right... smp_send_reschedule() has wider coverage, but even then there's
> > > some hardware that just can't do it reasonably (e.g. pre-APIC x86).
> >
> > Except that smp_send_reschedule() won't do anything unless the scheduler
> > things something needs to be done, as it its wake list is non-empty.
> > Which might explain why Peter Zijlstra didn't suggest it.
>
> The wake list stuff is separate from the original purpose of the IPI, which
> is to hit the need_resched check on IRQ exit. When that happens, the
> scheduler will call into RCU, even if it doesn't change threads.
Got it, thank you!
> > > So I guess the options are:
> > >
> > > 1. Accept that such hardware might experience delayed grace period
> > > completion in certain configurations,
> > > 2. Have such hardware check for need_resched in local_irq_enable() (not
> > > nice
> > > if sharing a kernel build with hardware that doesn't need it), or
> > > 3. Forbid the sequence (enforced by debug checks). Again, this would
> > > only
> > > prohibit rcu_read_lock()/local_irq_disable()/rcu_read_unlock()/
> > > local_irq_enable() *without* preempt disabling around the IRQ-disabled
> > > region.
> >
> > 4. If further testing continues to show it to be reliable, continue
> > using the scheme in -rcu.
>
> If the testing isn't done on machines that can't do the IPI then it's
> basically option #1. FWIW I don't think option #1 is unreasonable given
> that we're talking about very old and/or specialized hardware, and we're
> only talking about delays, not a crash (maybe limit the ability to use
> nohz_full on such hardware?). Of course if it turns out people are actually
> trying to run (modern versions of) RT on such hardware, that might be
> different. :-)
Having tried and failed to remove DEC Alpha support several times, I
know which way to bet. Though DEC Alpha support is no longer much of a
burden on the non-Alpha portions of Linux, so no longer much motivation
for removing its support.
> > 5. Use a short-duration hrtimer to get a clean environment in short
> > order. Yes, the timer might fire while preemption and/or softirqs
> > are disabled, but then the code can rely on the following
> > preempt_enable(), local_bh_enable(), or whatever. This condition
> > should be sufficiently rare to avoid issues with hrtimer overhead.
>
> Yeah, I considered that but was hesitant due to overhead -- at least in the
> case of the example I gave (pre-APIC x86), arming a oneshot timer is pretty
> slow. Plus, some hardware might entirely lack one-shot timer capability.
The overhead is incurred in a rare case, and on systems lacking oneshot
timer it is always possible to fall back on normal timers, albeit with
fixed-time added delays. But yes, this does add a bit of complexity.
Alternatively, assuming this case is rare, normal timers might suffice
without the need for hrtimers.
> > 6. Use smp_call_function_single() to IPI some other poor slob of a
> > CPU, which then does the same back. Non-waiting version in both
> > cases, of course.
>
> I was assuming any hardware that can't do smp_send_reschedule() is not SMP.
I have no idea either way.
> > Probably others as well.
> >
> > > > Why not simply make rcutorture cyheck whether it is running in a
> > > > PREEMPT_RT_FULL environment and avoid the PREEMPT_RT_FULL-unfriendly
> > > > testing only in that case?
> > > >
> > > > And should we later get to a place where the PREEMPT_RT_FULL-
> > > > unfriendly
> > > > scenarios are prohibited across all kernel configurations, then the
> > > > module
> > > > parameter can be removed. Again, until we know (as opposed to
> > > > suspect)
> > > > that these scenarios really don't happen, mainline rcutorture must
> > > > continue testing them.
> > >
> > > Yes, I already acknowledged that debug checks detecting the sequences
> > > should
> > > come before the test removal
> >
> > OK, good to hear. As you may have noticed, I was getting the impression
> > that you might have changed your mind on this point. ;-)
> >
> > > (including this patch as an RFC at this
> > > point
> > > was mainly meant as a demonstration of what's needed to get rcutorture
> > > to
> > > pass), but it'd be nice to have some idea of whether there would be
> > > opposition to the concept before coding up the checks. I'd rather not
> > > continue the state of "these sequences can blow up on RT and we don't
> > > know
> > > if they exist or not" any longer than necessary. Plus, only one of the
> > > sequences is exclusively an RT issue (though it's the one with the worst
> > > consequences).
> >
> > Steve Rostedt's point about enlisting the aid of lockdep seems worth
> > looking into.
>
> Sure. I was just concerned by the "Linus was against enforcing this in the
> past" comment and was hoping for more details.
It is sometimes the case that showing that something never happens makes
people more comfortable with enforcing that something.
Thanx, Paul
Powered by blists - more mailing lists