Message-ID: <202410161557.5b87225e-oliver.sang@intel.com>
Date: Wed, 16 Oct 2024 15:27:13 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
<linux-fsdevel@...r.kernel.org>, <ying.huang@...el.com>,
<feng.tang@...el.com>, <fengwei.yin@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [x86] 2865baf540: stress-ng.access.access_calls_per_sec 6.8% improvement
Hello,
kernel test robot noticed a 6.8% improvement in stress-ng.access.access_calls_per_sec on:
commit: 2865baf54077aa98fcdb478cefe6a42c417b9374 ("x86: support user address masking instead of non-speculative conditional")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: stress-ng
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:
	nr_threads: 100%
	disk: 1HDD
	testtime: 60s
	fs: btrfs
	test: access
	cpufreq_governor: performance
Details are below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241016/202410161557.5b87225e-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/1HDD/btrfs/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/access/stress-ng/60s
commit:
  v6.10
  2865baf540 ("x86: support user address masking instead of non-speculative conditional")
           v6.10 2865baf54077aa98fcdb478cefe
---------------- ---------------------------
         %stddev    %change          %stddev
             \          |                \
      1008 ± 35%     -45.4%     550.53 ± 74%  numa-meminfo.node0.Inactive(file)
    100.41 ± 55%     -63.1%      37.01 ± 70%  perf-sched.wait_and_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
   3373715            +6.8%    3603928        stress-ng.access.access_calls_per_sec
    252.58 ± 35%     -45.5%     137.68 ± 74%  numa-vmstat.node0.nr_inactive_file
    252.58 ± 35%     -45.5%     137.68 ± 74%  numa-vmstat.node0.nr_zone_inactive_file
      4.08            +3.5%       4.23        perf-stat.i.cpi
      4.10            +3.2%       4.24        perf-stat.overall.cpi
      0.24            -3.1%       0.24        perf-stat.overall.ipc
 3.326e+12            -3.2%   3.22e+12        perf-stat.total.instructions
      2.33 ±  5%       -0.2       2.10 ±  4%  perf-profile.calltrace.cycles-pp.syscall
      1.85 ±  5%       -0.2       1.63 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.faccessat
      1.86 ±  5%       -0.2       1.65 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
      1.76 ±  5%       -0.2       1.55 ±  4%  perf-profile.calltrace.cycles-pp.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.faccessat
      1.83 ±  5%       -0.2       1.62 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.87 ±  5%       -0.2       1.66 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.faccessat
      1.73 ±  5%       -0.2       1.52 ±  4%  perf-profile.calltrace.cycles-pp.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.48 ±  5%       -0.2       1.27 ±  4%  perf-profile.calltrace.cycles-pp.user_path_at_empty.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.49 ±  5%       -0.2       1.29 ±  4%  perf-profile.calltrace.cycles-pp.user_path_at_empty.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.faccessat
      2.19 ±  2%       -0.2       2.02 ±  3%  perf-profile.calltrace.cycles-pp.access
      1.84 ±  2%       -0.2       1.67 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.access
      1.86 ±  2%       -0.2       1.69 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.access
      1.76 ±  2%       -0.2       1.59 ±  3%  perf-profile.calltrace.cycles-pp.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.access
      1.40 ±  2%       -0.2       1.24 ±  3%  perf-profile.calltrace.cycles-pp.user_path_at_empty.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.access
      4.91 ±  4%       -0.6       4.28 ±  3%  perf-profile.children.cycles-pp.user_path_at_empty
      5.28 ±  4%       -0.6       4.70 ±  3%  perf-profile.children.cycles-pp.do_faccessat
      1.39 ±  4%       -0.5       0.84 ±  3%  perf-profile.children.cycles-pp.getname_flags
      0.95 ±  4%       -0.5       0.41 ±  3%  perf-profile.children.cycles-pp.strncpy_from_user
      2.41 ±  5%       -0.2       2.19 ±  4%  perf-profile.children.cycles-pp.syscall
      2.25 ±  2%       -0.2       2.08 ±  3%  perf-profile.children.cycles-pp.access
      0.12 ±  6%       -0.0       0.09 ±  5%  perf-profile.children.cycles-pp.btrfs_init_metadata_block_rsv
      0.08 ±  8%       -0.0       0.05 ±  8%  perf-profile.children.cycles-pp.btrfs_find_space_info
      0.10 ±  4%       -0.0       0.08 ±  5%  perf-profile.children.cycles-pp.fill_stack_inode_item
      0.48 ±  3%       -0.1       0.40 ±  3%  perf-profile.self.cycles-pp.strncpy_from_user
      0.08 ±  8%       -0.0       0.05 ±  7%  perf-profile.self.cycles-pp.btrfs_find_space_info
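
A note on reading the table above: the middle column appears to hold two kinds of deltas. Rows with a percent sign look like relative changes against the v6.10 value, while the perf-profile rows look like absolute differences in percentage points of cycles. The short Python sketch below rechecks a few rows under that reading. The helper name pct_change is made up for illustration, and the inputs are the rounded values printed in the table, so rows such as perf-stat.overall.cpi may differ from the report in the last digit if the report computes its changes from unrounded data (an assumption; the email does not say).

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0

# Values copied from the table (v6.10 vs. 2865baf540).
calls = pct_change(3373715, 3603928)   # stress-ng.access.access_calls_per_sec
insns = pct_change(3.326e12, 3.22e12)  # perf-stat.total.instructions
print(f"access_calls_per_sec: {calls:+.1f}%")
print(f"total.instructions:   {insns:+.1f}%")

# perf-profile rows: difference in percentage points, not a relative change.
upae_delta = 4.28 - 4.91               # children.cycles-pp.user_path_at_empty
print(f"user_path_at_empty:   {upae_delta:+.1f}")

# IPC is the reciprocal of CPI. Both sides round to the same two-decimal
# value, which would explain a row that reads 0.24 -> 0.24 with a nonzero
# change.
ipc_base = 1 / 4.10
ipc_new = 1 / 4.24
print(f"overall.ipc:          {ipc_base:.2f} -> {ipc_new:.2f}")
```

With the rounded inputs this reproduces the +6.8% and -3.2% rows and the -0.6 point drop in user_path_at_empty.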
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki