linux-kernel - Re: [LKP] Re: [sched/numa] 0fb3978b0a: stress-ng.fstat.ops_per

Message-ID: <20220309112404.GH15701@techsingularity.net>
Date:   Wed, 9 Mar 2022 11:24:04 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     "Huang, Ying" <ying.huang@...el.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        kernel test robot <oliver.sang@...el.com>,
        LKML <linux-kernel@...r.kernel.org>, x86@...nel.org,
        lkp@...ts.01.org, lkp@...el.com, fengwei.yin@...el.com,
        aubrey.li@...ux.intel.com, yu.c.chen@...el.com
Subject: Re: [LKP] Re: [sched/numa]  0fb3978b0a: stress-ng.fstat.ops_per_sec
 -18.9% regression

On Wed, Mar 09, 2022 at 05:28:55PM +0800, Huang, Ying wrote:
> Hi, All,
> 
> "Huang, Ying" <ying.huang@...el.com> writes:
> 
> > Hi, Oliver,
> >
> > Thanks for the report.
> >
> > I still cannot connect the regression with the patch.  To double
> > check, I reran the test with the "sched_verbose" kernel command line
> > and verified that the sched_domain isn't changed at all by the patch.
> >
> > kernel test robot <oliver.sang@...el.com> writes:
> >>       0.11 ±  6%      +0.1        0.16 ±  4%  perf-profile.self.cycles-pp.update_rq_clock
> >>       0.00            +0.1        0.06 ±  6%  perf-profile.self.cycles-pp.memset_erms
> >>       0.00            +0.1        0.07 ±  5%  perf-profile.self.cycles-pp.get_pid_task
> >>       0.06 ±  7%      +0.1        0.17 ±  6%  perf-profile.self.cycles-pp.select_task_rq_fair
> >>       0.54 ±  5%      +0.1        0.68        perf-profile.self.cycles-pp.lockref_put_return
> >>       4.26            +1.1        5.33        perf-profile.self.cycles-pp.common_perm_cond
> >>      15.45            +4.9       20.37        perf-profile.self.cycles-pp.lockref_put_or_lock
> >>      20.12            +6.7       26.82        perf-profile.self.cycles-pp.lockref_get_not_dead
> >
> > From the perf-profile above, the most visible change is more cycles in
> > lockref_get_not_dead(), which will loop with cmpxchg on
> > dentry->d_lockref.  So this appears to be related to the memory layout.
> > I will try to debug that.
> >
> > stress-ng is a weird "benchmark", although it's a very good
> > functionality test, and I cannot connect the patch with the test case
> > or the performance metrics collected.  So I think this regression
> > should be a low priority one that shouldn't block merging.  But I will
> > continue to investigate the regression to try to root cause it.
> 
> I have done more investigation into this.  It turns out the
> sched_domain did change after commit 0fb3978b0a, although the change
> is not shown in the default sched_verbose output.  sd->imb_numa_nr at
> the "NUMA" level changed from 24 to 12 after the commit.  So the
> following debug patch restores the performance.
> 

If Ice Lake has multiple last level caches per socket (I didn't check)
then the sd->imb_numa_nr would have changed. I didn't dig into what
stress-ng fstat is doing as it's a stress test more than a performance
test but given that the number of threads is 10% of the total, it's
possible that the workload is being split across nodes differently.
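
For reference, the cmpxchg loop on dentry->d_lockref mentioned in the
quoted profile can be sketched in userspace roughly as below. This is
only an illustration under stated assumptions, not the kernel's
lib/lockref.c: the type sketch_lockref, the helpers pack(), lock_of()
and count_of(), the 32/32 bit layout and the missing spinlock fallback
are all hypothetical simplifications. It shows why such a loop is
sensitive to placement: every successful or failed cmpxchg needs the
cache line exclusively, so threads spread across more nodes are likely
to retry more often on the same word.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout: low 32 bits hold the lock word (0 == unlocked),
 * high 32 bits hold a signed reference count (negative == dead).
 */
typedef struct {
    _Atomic uint64_t lock_count;
} sketch_lockref;

static uint64_t pack(uint32_t lock, int32_t count)
{
    return ((uint64_t)(uint32_t)count << 32) | lock;
}

static uint32_t lock_of(uint64_t v)
{
    return (uint32_t)v;
}

static int32_t count_of(uint64_t v)
{
    return (int32_t)(uint32_t)(v >> 32);
}

/*
 * Take a reference unless the object is dead.  Returns false if the
 * count is negative or if the lock is held; a real implementation
 * would fall back to taking the spinlock in the latter case, which
 * this sketch omits.
 */
static bool sketch_get_not_dead(sketch_lockref *ref)
{
    uint64_t old = atomic_load_explicit(&ref->lock_count,
                                        memory_order_relaxed);

    while (lock_of(old) == 0) {
        if (count_of(old) < 0)
            return false;

        uint64_t updated = pack(0, count_of(old) + 1);

        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak_explicit(&ref->lock_count,
                                                  &old, updated,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}
```

Under contention each failed compare-exchange costs another round trip
for the cache line, which is the kind of cost that would show up as
extra self cycles in a function like lockref_get_not_dead().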

-- 
Mel Gorman
SUSE Labs