Message-ID: <87mti7s9xz.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date:   Thu, 03 Mar 2022 16:43:20 +0800
From:   "Huang, Ying" <ying.huang@...el.com>
To:     kernel test robot <oliver.sang@...el.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>, <x86@...nel.org>,
        <lkp@...ts.01.org>, <lkp@...el.com>, <feng.tang@...el.com>,
        <zhengjun.xing@...ux.intel.com>, <fengwei.yin@...el.com>,
        <aubrey.li@...ux.intel.com>, <yu.c.chen@...el.com>
Subject: Re: [sched/numa]  0fb3978b0a:  stress-ng.fstat.ops_per_sec -18.9%
 regression

Hi, Oliver,

Thanks for the report.

I still cannot connect the regression with the patch.  To double
check, I reran the test with the "sched_verbose" kernel command-line
option and verified that the sched_domain isn't changed at all by the
patch.

kernel test robot <oliver.sang@...el.com> writes:
>       0.11 ± 6%      +0.1        0.16 ± 4%  perf-profile.self.cycles-pp.update_rq_clock
>       0.00            +0.1        0.06 ± 6%  perf-profile.self.cycles-pp.memset_erms
>       0.00            +0.1        0.07 ± 5%  perf-profile.self.cycles-pp.get_pid_task
>       0.06 ± 7%      +0.1        0.17 ± 6%  perf-profile.self.cycles-pp.select_task_rq_fair
>       0.54 ± 5%      +0.1        0.68        perf-profile.self.cycles-pp.lockref_put_return
>       4.26            +1.1        5.33        perf-profile.self.cycles-pp.common_perm_cond
>      15.45            +4.9       20.37        perf-profile.self.cycles-pp.lockref_put_or_lock
>      20.12            +6.7       26.82        perf-profile.self.cycles-pp.lockref_get_not_dead

From the perf-profile above, the most visible change is more cycles in
lockref_get_not_dead(), which will loop with cmpxchg on
dentry->d_lockref.  So this appears to be related to the memory layout.
I will try to debug that.
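
To illustrate the kind of cmpxchg retry loop meant here, below is a
minimal userspace sketch using C11 atomics.  It is not the kernel
implementation: the toy_* names and the packing of a lock word and a
count into one 64-bit value are assumptions made for illustration, and
the locked case simply reports failure where real code would fall back
to taking the lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of a lock+count pair packed into one 64-bit word:
 * low 32 bits = lock word (0 means unlocked), high 32 bits = signed count.
 */
struct toy_lockref {
	_Atomic uint64_t lock_count;
};

static inline uint32_t toy_lock(uint64_t v)
{
	return (uint32_t)v;
}

static inline int32_t toy_count(uint64_t v)
{
	return (int32_t)(v >> 32);
}

static inline uint64_t toy_pack(uint32_t lock, int32_t count)
{
	return ((uint64_t)(uint32_t)count << 32) | lock;
}

/*
 * Try to take a reference unless the object is dead (count < 0).
 * Returns true on success.  Returns false if the object is dead or if
 * the lock word is held (simplification: no locked slow path here).
 */
static inline bool toy_get_not_dead(struct toy_lockref *ref)
{
	uint64_t old = atomic_load_explicit(&ref->lock_count,
					    memory_order_relaxed);

	while (toy_lock(old) == 0) {
		if (toy_count(old) < 0)
			return false;

		uint64_t next = toy_pack(0, toy_count(old) + 1);

		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(&ref->lock_count,
							  &old, next,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
	return false;
}
```

Every failed compare-exchange means another CPU modified the same word
in the meantime, so the loop retries.  With many CPUs hitting one
counter, the time spent in such a loop depends heavily on how the cache
line holding that word is shared, which is why memory layout is a
plausible suspect.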

stress-ng is a weird "benchmark", although it's a very good
functionality test, and I cannot connect the patch with the test case
and the performance metrics collected.  So I think this regression
should be a low-priority one that shouldn't prevent merging.  But I
will continue to investigate the regression to try to root-cause it.

Best Regards,
Huang, Ying
