linux-kernel - Re: [LKP] Re: [rcu] ed93dfc6bc: stress-ng.icache.ops_per

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191210141030.GP32275@shao2-debian>
Date:   Tue, 10 Dec 2019 22:10:30 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        lkp@...ts.01.org
Subject: Re: [LKP] Re: [rcu] ed93dfc6bc: stress-ng.icache.ops_per_sec -15.0%
 regression

On Mon, Dec 09, 2019 at 04:57:08PM -0800, Paul E. McKenney wrote:
> On Mon, Dec 09, 2019 at 05:10:11PM +0800, Rong Chen wrote:
> > 
> > 
> > On 12/9/19 2:13 AM, Paul E. McKenney wrote:
> > > On Sun, Dec 08, 2019 at 11:29:07PM +0800, kernel test robot wrote:
> > > > Greeting,
> > > > 
> > > > FYI, we noticed a -15.0% regression of stress-ng.icache.ops_per_sec due to commit:
> > > > 
> > > > 
> > > > commit: ed93dfc6bc0084485ccad1ff6bd2ea81ab2c03cd ("rcu: Confine ->core_needs_qs accesses to the corresponding CPU")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > 
> > > > in testcase: stress-ng
> > > > on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
> > > > with following parameters:
> > > > 
> > > > 	nr_threads: 100%
> > > > 	disk: 1HDD
> > > > 	testtime: 1s
> > > > 	class: cpu-cache
> > > > 	cpufreq_governor: performance
> > > > 	ucode: 0x500002c
> > > > 
> > > > 
> > > > 
> > > > 
> > > > If you fix the issue, kindly add following tag
> > > > Reported-by: kernel test robot <rong.a.chen@...el.com>
> > > > 
> > > > 
> > > > Details are as below:
> > > > -------------------------------------------------------------------------------------------------->
> > > > 
> > > > 
> > > > To reproduce:
> > > > 
> > > >          git clone https://github.com/intel/lkp-tests.git
> > > >          cd lkp-tests
> > > >          bin/lkp install job.yaml  # job file is attached in this email
> > > >          bin/lkp run     job.yaml
> > > > 
> > > > =========================================================================================
> > > > class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
> > > >    cpu-cache/gcc-7/performance/1HDD/x86_64-rhel-7.6/100%/debian-x86_64-2019-11-14.cgz/lkp-csl-2sp5/stress-ng/1s/0x500002c
> > > > 
> > > > commit:
> > > >    516e5ae0c9 ("rcu: Reset CPU hints when reporting a quiescent state")
> > > >    ed93dfc6bc ("rcu: Confine ->core_needs_qs accesses to the corresponding CPU")
> > > > 
> > > > 516e5ae0c9401629 ed93dfc6bc0084485ccad1ff6bd
> > > > ---------------- ---------------------------
> > > >           %stddev     %change         %stddev
> > > >               \          |                \
> > > >       39049           -15.0%      33189 ± 14%  stress-ng.icache.ops_per_sec
> > > I have to ask...
> > > 
> > > Given a 14% standard deviation, is this 15% change statistically
> > > significant?
> > > 
> > > On the other hand, if this is due to lengthened grace periods, which
> > > are a known side-effect of this commit, there are speedups for that
> > > coming down the pike.
> > 
> > Hi Paul,
> > 
> > We run the test more times and stress-ng.icache.ops_per_sec is unstable:
> > 
> > 516e5ae0c9401629  ed93dfc6bc0084485ccad1ff6b
> > ----------------  -------------------------- ---------------------------
> >          %stddev      change         %stddev
> >                    \              |                         \
> >      37409 ±  6%        -4%      35958 ± 11%
> > stress-ng/cpu-cache-performance-1HDD-100%-1s-ucode=0x500002c/lkp-csl-2sp5
> > 
> > 
> > > 							Thanx, Paul
> > > 
> > > >        7784           -36.6%       4939 ±  9%  stress-ng.membarrier.ops
> > > >        7648           -37.3%       4793 ±  9%  stress-ng.membarrier.ops_per_sec
> > 
> > there is still a regression of stress-ng.membarrier.ops_per_sec:
> > 
> > 
> > 516e5ae0c9401629  ed93dfc6bc0084485ccad1ff6b
> > ----------------  -------------------------- ---------------------------
> >          %stddev      change         %stddev
> >                    \              |                      \
> >       7522               -37%       4744 ±  9%
> > stress-ng/cpu-cache-performance-1HDD-100%-1s-ucode=0x500002c/lkp-csl-2sp5
> 
> OK, and it is much harder for me to argue that this one is statistically
> insignificant.  From the softirq stats below, I got too aggressive about
> removing instances of "rdp->core_needs_qs = false".  Does the (lightly
> tested) commit below help?
> 
> 							Thanx, Paul

Hi Paul,

The patch can fix the regression of stress-ng.membarrier.ops_per_sec:

commit:
  516e5ae0c9 ("rcu: Reset CPU hints when reporting a quiescent state")
  ed93dfc6bc ("rcu: Confine ->core_needs_qs accesses to the corresponding CPU")
  8b3a369c70 ("rcu: Clear ->core_needs_qs at GP end or self-reported QS")

516e5ae0c9401629  ed93dfc6bc0084485ccad1ff6b  8b3a369c70986933c24efdf674  testcase/testparams/testbox
----------------  --------------------------  --------------------------  ---------------------------
         %stddev      change         %stddev      change         %stddev
             \          |                \          |                \  
      7522             -37%       4744 ±  9%                  7392 ±  3%  stress-ng/cpu-cache-performance-1HDD-100%-1s-ucode=0x500002c/lkp-csl-2sp5
      7522             -37%       4744                        7392        GEO-MEAN stress-ng.membarrier.ops_per_sec

Best Regards,
Rong Chen

> 
> ------------------------------------------------------------------------
> 
> commit 57412364506cf5262a7fdffa4718bf39b8891940
> Author: Paul E. McKenney <paulmck@...nel.org>
> Date:   Mon Dec 9 15:19:45 2019 -0800
> 
>     rcu: Clear ->core_needs_qs at GP end or self-reported QS
>     
>     The rcu_data structure's ->core_needs_qs field does not necessarily get
>     cleared in a timely fashion after the corresponding CPUs' quiescent state
>     has been reported.  From a functional viewpoint, no harm done, but this
>     can result in excessive invocation of RCU core processing, as witnessed
>     by the kernel test robot, which saw greatly increased softirq overhead.
>     
>     This commit therefore restores the rcu_report_qs_rdp() function's
>     clearing of this field, but only when running on the corresponding CPU.
>     Cases where some other CPU reports the quiescent state (for example, on
>     behalf of an idle CPU) are handled by setting this field appropriately
>     within the __note_gp_changes() function's end-of-grace-period checks.
>     This handling is carried out regardless of whether the end of a grace
>     period actually happened, thus handling the case where a CPU goes non-idle
>     after a quiescent state is reported on its behalf, but before the grace
>     period ends.  This fix also avoids cross-CPU updates to ->core_needs_qs,
>     
>     While in the area, this commit changes the __note_gp_changes() need_gp
>     variable's name to need_qs because it is a quiescent state that is needed
>     from the CPU in question.
>     
>     Fixes: ed93dfc6bc00 ("rcu: Confine ->core_needs_qs accesses to the corresponding CPU")
>     Reported-by: kernel test robot <rong.a.chen@...el.com>
>     Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index f555ea9..1d0935e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1393,7 +1393,7 @@ static void __maybe_unused rcu_advance_cbs_nowake(struct rcu_node *rnp,
>  static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
>  {
>  	bool ret = false;
> -	bool need_gp;
> +	bool need_qs;
>  	const bool offloaded = IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
>  			       rcu_segcblist_is_offloaded(&rdp->cblist);
>  
> @@ -1407,10 +1407,13 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
>  	    unlikely(READ_ONCE(rdp->gpwrap))) {
>  		if (!offloaded)
>  			ret = rcu_advance_cbs(rnp, rdp); /* Advance CBs. */
> +		rdp->core_needs_qs = false;
>  		trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuend"));
>  	} else {
>  		if (!offloaded)
>  			ret = rcu_accelerate_cbs(rnp, rdp); /* Recent CBs. */
> +		if (rdp->core_needs_qs)
> +			rdp->core_needs_qs = !!(rnp->qsmask & rdp->grpmask);
>  	}
>  
>  	/* Now handle the beginnings of any new-to-this-CPU grace periods. */
> @@ -1422,9 +1425,9 @@ static bool __note_gp_changes(struct rcu_node *rnp, struct rcu_data *rdp)
>  		 * go looking for one.
>  		 */
>  		trace_rcu_grace_period(rcu_state.name, rnp->gp_seq, TPS("cpustart"));
> -		need_gp = !!(rnp->qsmask & rdp->grpmask);
> -		rdp->cpu_no_qs.b.norm = need_gp;
> -		rdp->core_needs_qs = need_gp;
> +		need_qs = !!(rnp->qsmask & rdp->grpmask);
> +		rdp->cpu_no_qs.b.norm = need_qs;
> +		rdp->core_needs_qs = need_qs;
>  		zero_cpu_stall_ticks(rdp);
>  	}
>  	rdp->gp_seq = rnp->gp_seq;  /* Remember new grace-period state. */
> @@ -2000,6 +2003,8 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
>  		return;
>  	}
>  	mask = rdp->grpmask;
> +	if (rdp->cpu == smp_processor_id())
> +		rdp->core_needs_qs = false;
>  	if ((rnp->qsmask & mask) == 0) {
>  		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
>  	} else {
> _______________________________________________
> LKP mailing list -- lkp@...ts.01.org
> To unsubscribe send an email to lkp-leave@...ts.01.org