Message-ID: <202306071452.887afdea-oliver.sang@intel.com>
Date: Wed, 7 Jun 2023 14:32:17 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
<linux-kernel@...r.kernel.org>, <ying.huang@...el.com>,
<feng.tang@...el.com>, <fengwei.yin@...el.com>,
<oliver.sang@...el.com>
Subject: [linus:master] [x86] 20f3337d35: stress-ng.lockofd.ops_per_sec 3.4% improvement
Hello,
kernel test robot noticed a 3.4% improvement of stress-ng.lockofd.ops_per_sec on:
commit: 20f3337d350c4e1b4ac66d731fd4e98565bf6cc0 ("x86: don't use REP_GOOD or ERMS for small memory clearing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
	nr_threads: 10%
	disk: 1HDD
	testtime: 60s
	fs: ext4
	class: os
	test: lockofd
	cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
os/gcc-12/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp7/lockofd/stress-ng/60s
commit:
  68674f94ff ("x86: don't use REP_GOOD or ERMS for small memory copies")
  20f3337d35 ("x86: don't use REP_GOOD or ERMS for small memory clearing")
68674f94ffc9dddc 20f3337d350c4e1b4ac66d731fd
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     39.25          +2.2%      40.10        iostat.cpu.user
      0.05          -0.0        0.03 ± 16%  mpstat.cpu.all.soft%
 1.212e+08          +3.4%  1.253e+08        stress-ng.lockofd.ops
   2019682          +3.4%     2087883        stress-ng.lockofd.ops_per_sec
 9.311e+08          +6.8%  9.944e+08        perf-stat.i.branch-instructions
      0.67          -7.1%       0.63        perf-stat.i.cpi
 1.172e+09          +2.5%  1.202e+09        perf-stat.i.dTLB-loads
      0.00 ± 3%     -0.0        0.00 ± 2%   perf-stat.i.dTLB-store-miss-rate%
 8.309e+08         +15.2%  9.575e+08        perf-stat.i.dTLB-stores
 4.579e+09          +7.8%  4.934e+09        perf-stat.i.instructions
      1.49          +7.8%       1.60        perf-stat.i.ipc
      1.16         +62.7%       1.89 ± 2%   perf-stat.i.metric.G/sec
      1771         -28.8%       1261 ± 4%   perf-stat.i.metric.M/sec
     19.22 ± 2%     -1.8       17.39        perf-profile.calltrace.cycles-pp.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     53.24          -1.8       51.44        perf-profile.calltrace.cycles-pp.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     60.31          -1.5       58.83        perf-profile.calltrace.cycles-pp.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.48 ± 4%     -0.6        5.87 ± 4%   perf-profile.calltrace.cycles-pp.kmem_cache_alloc.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
      1.03 ± 7%     -0.2        0.88 ± 10%  perf-profile.calltrace.cycles-pp.flock64_to_posix_lock.fcntl_getlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
      5.76 ± 3%     +0.4        6.12 ± 2%   perf-profile.calltrace.cycles-pp.stress_mwc64modn
      0.00          +1.5        1.51 ± 4%   perf-profile.calltrace.cycles-pp.memset_orig.kmem_cache_alloc.fcntl_setlk.do_fcntl.__x64_sys_fcntl
      0.00          +1.6        1.60 ± 7%   perf-profile.calltrace.cycles-pp.memset_orig.kmem_cache_alloc.fcntl_getlk.do_fcntl.__x64_sys_fcntl
     53.94          -1.7       52.24        perf-profile.children.cycles-pp.do_fcntl
     19.65          -1.6       18.07        perf-profile.children.cycles-pp.fcntl_setlk
     13.82          -1.0       12.83 ± 3%   perf-profile.children.cycles-pp.kmem_cache_alloc
      5.96 ± 3%     +0.4        6.37 ± 3%   perf-profile.children.cycles-pp.stress_mwc64modn
      5.88 ± 2%     +0.5        6.39 ± 3%   perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.00          +3.3        3.29 ± 5%   perf-profile.children.cycles-pp.memset_orig
      0.84 ± 9%     +0.2        0.99 ± 9%   perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      5.60 ± 2%     +0.4        6.04 ± 3%   perf-profile.self.cycles-pp.stress_mwc64modn
      5.58 ± 2%     +0.5        6.09 ± 2%   perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.00          +3.1        3.11 ± 5%   perf-profile.self.cycles-pp.memset_orig
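
The %change column above mixes two kinds of values: entries with a trailing
"%" appear to be relative changes against the parent commit, while bare entries
(the perf-profile rows and the other metrics that are already percentages) appear
to be absolute differences in percentage points. That reading is an assumption
based on the numbers in this report, not on the lkp-tests source. A minimal
Python sketch for cross-checking a few rows by hand follows; the helper names
pct_change and point_delta are hypothetical, and because the table prints
rounded values, a recomputed figure can differ from the printed one in the last
digit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference between two values that are already percentages."""
    return new - base


# (parent commit 68674f94ff, tested commit 20f3337d35), copied from the table.
relative_rows = {
    "stress-ng.lockofd.ops_per_sec": (2019682, 2087883),
    "perf-stat.i.instructions": (4.579e09, 4.934e09),
    "perf-stat.i.dTLB-stores": (8.309e08, 9.575e08),
}
point_rows = {
    "perf-profile.children.cycles-pp.do_fcntl": (53.94, 52.24),
    "perf-profile.children.cycles-pp.memset_orig": (0.00, 3.29),
}

for name, (base, new) in relative_rows.items():
    print(f"{pct_change(base, new):+6.1f}%  {name}")

for name, (base, new) in point_rows.items():
    print(f"{point_delta(base, new):+6.1f}   {name}")
```

With the values above, the recomputed figures round to the same +3.4%, +7.8%,
+15.2%, -1.7 and +3.3 that the report prints for those rows.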
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
View attachment "config-6.3.0-rc7-00002-g20f3337d350c" of type "text/plain" (158337 bytes)
View attachment "job-script" of type "text/plain" (9238 bytes)
View attachment "job.yaml" of type "text/plain" (6216 bytes)
View attachment "reproduce" of type "text/plain" (535 bytes)