[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20181015195426.GD2674@linux.ibm.com>
Date:   Mon, 15 Oct 2018 12:54:26 -0700
From:   "Paul E. McKenney" <paulmck@...ux.ibm.com>
To:     Joel Fernandes <joel@...lfernandes.org>
Cc:     Nikolay Borisov <nborisov@...e.com>, linux-kernel@...r.kernel.org,
        Jonathan Corbet <corbet@....net>,
        Josh Triplett <josh@...htriplett.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        linux-doc@...r.kernel.org,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about
 disabling preemption
On Mon, Oct 15, 2018 at 12:39:51PM -0700, Joel Fernandes wrote:
> On Mon, Oct 15, 2018 at 04:21:12AM -0700, Paul E. McKenney wrote:
> > On Mon, Oct 15, 2018 at 09:05:22AM +0300, Nikolay Borisov wrote:
> > > On 15.10.2018 05:47, Joel Fernandes wrote:
> > > > On Sun, Oct 14, 2018 at 07:33:28PM -0700, Paul E. McKenney wrote:
> > > >> On Sun, Oct 14, 2018 at 07:13:49PM -0700, Joel Fernandes wrote:
> > > >>> On Sun, Oct 14, 2018 at 07:08:27PM -0700, Joel Fernandes wrote:
> > > >>>> On Sun, Oct 14, 2018 at 04:17:31PM -0700, Paul E. McKenney wrote:
> > > >>>>> On Sun, Oct 14, 2018 at 02:29:55PM -0700, Joel Fernandes (Google) wrote:
> > > >>>>>> The Requirements.html document says "Disabling Preemption Does Not Block
> > > >>>>>> Grace Periods". However this is no longer true with the RCU
> > > >>>>>> consolidation. Lets remove the obsolete (non-)requirement entirely.
> > > >>>>>>
> > > >>>>>> Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> > > >>>>>
> > > >>>>> Good catch, queued, thank you!
> > > >>>>
> > > >>>> Thanks! By the way after I sent the patch, I also tried Oleg's experiment to
> > > >>>> confirm that this is indeed obsolete.  :)
> > > >>>>
> > > >>>> One thing interesting came up when I tried synchronize_rcu_expedited()
> > > >>>> instead of synchronize_rcu() in Oleg's experiment, I still saw a multiple
> > > >>>> millisecond delay between when the rcu read section completely and the
> > > >>>> synchronize_rcu_expedited returns:
> > > >>>>
> > > >>>> For example, with synchronize_rcu_expedited, the 'SPIN done' and the 'SYNC
> > > >>>> done' are about 3 millisecond apart:
> > > >>>> [   77.599142] SPIN start
> > > >>>> [   77.601595] SYNC start
> > > >>>> [   82.604950] SPIN done!
> > > >>>> [   82.607836] SYNC done!
> > > >>>>  I saw anywhere from 2-6 milliseconds.
> > > >>>>
> > > >>>> The reason I bring this up is according to Requirements.html: In some cases,
> > > >>>> the multi-millisecond synchronize_rcu() latencies are unacceptable. In these
> > > >>>> cases, synchronize_rcu_expedited() may be used instead,.. so either I messed
> > > >>>> something up in the experiment, or I need to update this part of the document ;-)
> > > >>
> > > >> In normal testing, 2-6 milliseconds is indeed excessive.  Could you please
> > > >> point me at Oleg's experiment?  Also, what CONFIG_PREEMPT setting were
> > > >> you using?  (My guess is CONFIG_PREEMPT=y.)
> > > > 
> > > > The CONFIG_PREEMPT config I am using is CONFIG_PREEMPT=y.
> > > > 
> > > >>> So I realized I'm running in Qemu so it could also be a scheduling delay of
> > > >>> the vcpu thread. So apologies about the noise if the experiment works fine
> > > >>> for you.
> > > >>
> > > >> I used rcuperf, which might not be doing the same thing as Oleg's
> > > >> experiment.
> > > > 
> > > > The experiment is mentioned at:
> > > > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg912055.html
> > > > 
> > > > If you apply the below diff, it applies cleanly on rcu/dev. And then run:
> > > > taskset 2 perl -e 'syscall 157, 666, 5000' &
> > > > taskset 1 perl -e 'syscall 157, 777'
> > > > 
> > > > diff --git a/kernel/sys.c b/kernel/sys.c
> > > > index cf5c67533ff1..b654b7566ca3 100644
> > > > --- a/kernel/sys.c
> > > > +++ b/kernel/sys.c
> > > > @@ -2261,6 +2261,9 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which,
> > > >  	return -EINVAL;
> > > >  }
> > > >  
> > > > +#include <linux/delay.h>
> > > > +
> > > > +
> > > >  SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
> > > >  		unsigned long, arg4, unsigned long, arg5)
> > > >  {
> > > > @@ -2274,6 +2277,19 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
> > > >  
> > > >  	error = 0;
> > > >  	switch (option) {
> > > > +	case 666:
> > > > +		preempt_disable();
> > > > +		pr_crit("SPIN start\n");
> > > > +		while (arg2--)
> > > > +			mdelay(1);
> > 
> > OK, this is the problem.  When you spin in the kernel for several
> > milliseconds with preemption disabled, the consolidated grace period
> > is -required- to wait for this preemption-disabled reader to complete,
> > whether expedited or not.
> > 
> > So, expected behavior.  ;-)
> 
> Cool. Thanks for confirming. I ran some tests too and if I reduce the
> preempt_disabled section's duration, then the delay for
> synchronize_rcu_expedited is much lesser indeed.
Good to hear!
> > In any case, please don't spin for milliseconds with preemption disabled.
> > The real-time guys are unlikely to be happy with you if you do this!
> 
> Well just to clarify, I was just running Oleg's test which did this. This
> test was mentioned in the original documentation that I deleted. Ofcourse I
> would not dare do such a thing in production code :-D. I guess to Oleg's
> defense, he did it to very that synchronize_rcu() was not blocked on
> preempt-disable sections which was a different test.
Understood!  Just pointing out that RCU's tolerating a given action does
not necessarily mean that it is a good idea to take that action.  ;-)
> > > > +		pr_crit("SPIN done!\n");
> > > > +		preempt_enable();
> > > > +		break;
> > > > +	case 777:
> > > > +		pr_crit("SYNC start\n");
> > > > +		synchronize_rcu();
> > > > +		pr_crit("SYNC done!\n");
> > > 
> > > But you are using the console printing infrastructure which is rather
> > > heavyweight. Try replacing pr_* calls with trace_printk so that you
> > > write to the lock-free ring buffer, this will reduce the noise from the
> > > heavy console printing infrastructure.
> > 
> > And this might be a problem as well.
> 
> This was not the issue (or atleast not fully the issue) since I saw the same
> thing with trace_printk. It was exactly what you said - which is the
> excessively long preempt disabled times.
One approach would be to apply this patch against (say) v4.18, which
does not have consolidated grace periods.  You might then be able to
tell if the pr_crit() calls make any difference.
							Thanx, Paul
Powered by blists - more mailing lists
 
