[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191210141030.GP32275@shao2-debian>
Date: Tue, 10 Dec 2019 22:10:30 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
lkp@...ts.01.org
Subject: Re: [LKP] Re: [rcu] ed93dfc6bc: stress-ng.icache.ops_per_sec -15.0%
regression
On Mon, Dec 09, 2019 at 04:57:08PM -0800, Paul E. McKenney wrote:
> On Mon, Dec 09, 2019 at 05:10:11PM +0800, Rong Chen wrote:
> >
> >
> > On 12/9/19 2:13 AM, Paul E. McKenney wrote:
> > > On Sun, Dec 08, 2019 at 11:29:07PM +0800, kernel test robot wrote:
> > > > Greeting,
> > > >
> > > > FYI, we noticed a -15.0% regression of stress-ng.icache.ops_per_sec due to commit:
> > > >
> > > >
> > > > commit: ed93dfc6bc0084485ccad1ff6bd2ea81ab2c03cd ("rcu: Confine ->core_needs_qs accesses to the corresponding CPU")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > >
> > > > in testcase: stress-ng
> > > > on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
> > > > with following parameters:
> > > >
> > > > nr_threads: 100%
> > > > disk: 1HDD
> > > > testtime: 1s
> > > > class: cpu-cache
> > > > cpufreq_governor: performance
> > > > ucode: 0x500002c
> > > >
> > > >
> > > >
> > > >
> > > > If you fix the issue, kindly add following tag
> > > > Reported-by: kernel test robot <rong.a.chen@...el.com>
> > > >
> > > >
> > > > Details are as below:
> > > > -------------------------------------------------------------------------------------------------->
> > > >
> > > >
> > > > To reproduce:
> > > >
> > > > git clone https://github.com/intel/lkp-tests.git
> > > > cd lkp-tests
> > > > bin/lkp install job.yaml # job file is attached in this email
> > > > bin/lkp run job.yaml
> > > >
> > > > =========================================================================================
> > > > class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
> > > > cpu-cache/gcc-7/performance/1HDD/x86_64-rhel-7.6/100%/debian-x86_64-2019-11-14.cgz/lkp-csl-2sp5/stress-ng/1s/0x500002c
> > > >
> > > > commit:
> > > > 516e5ae0c9 ("rcu: Reset CPU hints when reporting a quiescent state")
> > > > ed93dfc6bc ("rcu: Confine ->core_needs_qs accesses to the corresponding CPU")
> > > >
> > > > 516e5ae0c9401629 ed93dfc6bc0084485ccad1ff6bd
> > > > ---------------- ---------------------------
> > > > %stddev %change %stddev
> > > > \ | \
> > > > 39049 -15.0% 33189 ± 14% stress-ng.icache.ops_per_sec
> > > I have to ask...
> > >
> > > Given a 14% standard deviation, is this 15% change statistically
> > > significant?
> > >
> > > On the other hand, if this is due to lengthened grace periods, which
> > > are a known side-effect of this commit, there are speedups for that
> > > coming down the pike.
> >
> > Hi Paul,
> >
> > We run the test more times and stress-ng.icache.ops_per_sec is unstable:
> >
> > 516e5ae0c9401629 ed93dfc6bc0084485ccad1ff6b
> > ---------------- -------------------------- ---------------------------
> > %stddev change %stddev
> > \ | \
> > 37409 ± 6% -4% 35958 ± 11%
> > stress-ng/cpu-cache-performance-1HDD-100%-1s-ucode=0x500002c/lkp-csl-2sp5
> >
> >
> > > Thanx, Paul
> > >
> > > > 7784 -36.6% 4939 ± 9% stress-ng.membarrier.ops
> > > > 7648 -37.3% 4793 ± 9% stress-ng.membarrier.ops_per_sec
> >
> > there is still a regression of stress-ng.membarrier.ops_per_sec:
> >
> >
> > 516e5ae0c9401629 ed93dfc6bc0084485ccad1ff6b
> > ---------------- -------------------------- ---------------------------
> > %stddev change %stddev
> > \ | \
> > 7522 -37% 4744 ± 9%
> > stress-ng/cpu-cache-performance-1HDD-100%-1s-ucode=0x500002c/lkp-csl-2sp5
>
> OK, and it is much harder for me to argue that this one is statistically
> insignificant. From the softirq stats below, I got too aggressive about
> removing instances of "rdp->core_needs_qs = false". Does the (lightly
> tested) commit below help?
>
> Thanx, Paul
Hi Paul,
The patch can fix the regression of stress-ng.membarrier.ops_per_sec:
commit:
516e5ae0c9 ("rcu: Reset CPU hints when reporting a quiescent state")
ed93dfc6bc ("rcu: Confine ->core_needs_qs accesses to the corresponding CPU")
8b3a369c70 ("rcu: Clear ->core_needs_qs at GP end or self-reported QS")
516e5ae0c9401629 ed93dfc6bc0084485ccad1ff6b 8b3a369c70986933c24efdf674 testcase/testparams/testbox
---------------- -------------------------- -------------------------- ---------------------------
%stddev change %stddev change %stddev
\ | \ | \
7522 -37% 4744 ± 9% 7392 ± 3% stress-ng/cpu-cache-performance-1HDD-100%-1s-ucode=0x500002c/lkp-csl-2sp5
7522 -37% 4744 7392 GEO-MEAN stress-ng.membarrier.ops_per_sec
Best Regards,
Rong Chen
>
> ------------------------------------------------------------------------
>
> commit 57412364506cf5262a7fdffa4718bf39b8891940
> Author: Paul E. McKenney <paulmck@...nel.org>
> Date: Mon Dec 9 15:19:45 2019 -0800
>
> rcu: Clear ->core_needs_qs at GP end or self-reported QS
>
> The rcu_data structure's ->core_needs_qs field does not necessarily get
> cleared in a timely fashion after the corresponding CPUs' quiescent state
> has been reported. From a functional viewpoint, no harm done, but this
> can result in excessive invocation of RCU core processing, as witnessed
> by the kernel test robot, which saw greatly increased softirq overhead.
>
> This commit therefore restores the rcu_report_qs_rdp() function's
> clearing of this field, but only when running on the corresponding CPU.
> Cases where some other CPU reports the quiescent state (for example, on
> behalf of an idle CPU) are handled by setting this field appropriately
> within the __note_gp_changes() function's end-of-grace-period checks.
> This handling is carried out regardless of whether the end of a grace
> period actually happened, thus handling the case where a CPU goes non-idle
> after a quiescent state is reported on its behalf, but before the grace
> period ends. This fix also avoids cross-CPU updates to ->core_needs_qs,
>
> While in the area, this commit changes the __note_gp_changes() need_gp
> variable's name to need_qs because it is a quiescent state that is needed
> from the CPU in question.
>
> Fixes: ed93dfc6bc00 ("rcu: Confine ->core_needs_qs accesses to the corresponding CPU")
> Reported-by: kernel test robot <rong.a.chen@...el.com>
> Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f555ea9..1d0935e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1393,7 +1393,7 @@ static void __maybe_unused rcu_advance_cbs_nowake(struct rcu_node *rnp,
> static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
> {
> bool ret = false;
> - bool need_gp;
> + bool need_qs;
> const bool offloaded = IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
> rcu_segcblist_is_offloaded(&rdp->cblist);
>
> @@ -1407,10 +1407,13 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
> unlikely(READ_ONCE(rdp->gpwrap))) {
> if (!offloaded)
> ret = rcu_advance_cbs(rnp, rdp); /* Advance CBs. */
> + rdp->core_needs_qs = false;
> trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuend"));
> } else {
> if (!offloaded)
> ret = rcu_accelerate_cbs(rnp, rdp); /* Recent CBs. */
> + if (rdp->core_needs_qs)
> + rdp->core_needs_qs = !!(rnp->qsmask & rdp->grpmask);
> }
>
> /* Now handle the beginnings of any new-to-this-CPU grace periods. */
> @@ -1422,9 +1425,9 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
> * go looking for one.
> */
> trace_rcu_grace_period(rcu_state.name, rnp->gp_seq, TPS("cpustart"));
> - need_gp = !!(rnp->qsmask & rdp->grpmask);
> - rdp->cpu_no_qs.b.norm = need_gp;
> - rdp->core_needs_qs = need_gp;
> + need_qs = !!(rnp->qsmask & rdp->grpmask);
> + rdp->cpu_no_qs.b.norm = need_qs;
> + rdp->core_needs_qs = need_qs;
> zero_cpu_stall_ticks(rdp);
> }
> rdp->gp_seq = rnp->gp_seq; /* Remember new grace-period state. */
> @@ -2000,6 +2003,8 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
> return;
> }
> mask = rdp->grpmask;
> + if (rdp->cpu == smp_processor_id())
> + rdp->core_needs_qs = false;
> if ((rnp->qsmask & mask) == 0) {
> raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> } else {
> _______________________________________________
> LKP mailing list -- lkp@...ts.01.org
> To unsubscribe send an email to lkp-leave@...ts.01.org
Powered by blists - more mailing lists