Message-ID: <20201218063803.GB12524@xsang-OptiPlex-9020>
Date: Fri, 18 Dec 2020 14:38:03 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Steven Rostedt <rostedt@...dmis.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [perf/x86] e506d1dac0: stress-ng.sigsuspend.ops_per_sec 58.5%
improvement
Greetings,
FYI, we noticed a 58.5% improvement in stress-ng.sigsuspend.ops_per_sec due to commit:
commit: e506d1dac0edb2df82f2aa0582e814f9cd9aa07d ("perf/x86: Make dummy_iregs static")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with the following parameters:
nr_threads: 100%
disk: 1HDD
testtime: 30s
class: interrupt
cpufreq_governor: performance
ucode: 0x5002f01
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
interrupt/gcc-9/performance/1HDD/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp7/stress-ng/30s/0x5002f01
commit:
76a4efa809 ("perf/arch: Remove perf_sample_data::regs_user_copy")
e506d1dac0 ("perf/x86: Make dummy_iregs static")
76a4efa80900fc40 e506d1dac0edb2df82f2aa0582e
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:6 17% 1:6 kmsg.BTRFS_error(device_sda1):bdev/dev/sda1_errs:wr#,rd#,flush#,corrupt#,gen
%stddev %change %stddev
\ | \
19676113 ± 2% +38.4% 27232673 stress-ng.sigrt.ops
655554 ± 2% +38.2% 906196 stress-ng.sigrt.ops_per_sec
19953963 ± 4% +58.5% 31619237 stress-ng.sigsuspend.ops
665085 ± 4% +58.5% 1053922 stress-ng.sigsuspend.ops_per_sec
1.199e+08 ± 3% +40.0% 1.677e+08 ± 6% stress-ng.time.involuntary_context_switches
3.251e+08 ± 6% +21.7% 3.958e+08 ± 5% stress-ng.time.voluntary_context_switches
92940673 -2.8% 90309178 interrupts.CAL:Function_call_interrupts
672419 ± 4% +23.8% 832364 ± 5% vmstat.system.cs
46309 ± 4% -9.3% 41992 ± 4% slabinfo.Acpi-State.active_objs
912.33 ± 4% -9.4% 826.17 ± 4% slabinfo.Acpi-State.active_slabs
46559 ± 4% -9.5% 42156 ± 4% slabinfo.Acpi-State.num_objs
912.33 ± 4% -9.4% 826.17 ± 4% slabinfo.Acpi-State.num_slabs
476933 ± 4% +8.1% 515358 ± 3% sched_debug.cpu.avg_idle.avg
2404226 ± 5% +17.0% 2813799 ± 6% sched_debug.cpu.nr_switches.avg
2707460 ± 3% +21.0% 3276452 ± 8% sched_debug.cpu.nr_switches.max
2046442 ± 5% +12.4% 2300906 ± 7% sched_debug.cpu.nr_switches.min
150123 ± 14% +47.4% 221244 ± 20% sched_debug.cpu.nr_switches.stddev
2494371 ± 5% +16.3% 2900267 ± 5% sched_debug.cpu.sched_count.avg
3030416 ± 5% +15.6% 3504469 ± 9% sched_debug.cpu.sched_count.max
1526367 ± 8% +17.8% 1797379 ± 7% sched_debug.cpu.ttwu_count.avg
1737242 ± 7% +18.8% 2064291 ± 7% sched_debug.cpu.ttwu_count.max
1321299 ± 8% +15.6% 1527713 ± 10% sched_debug.cpu.ttwu_count.min
1053766 ± 2% +16.9% 1231415 ± 5% sched_debug.cpu.ttwu_local.avg
1254194 +19.1% 1493169 ± 7% sched_debug.cpu.ttwu_local.max
23.91 -2.4 21.54 ± 6% perf-stat.i.cache-miss-rate%
59213457 ± 4% -17.8% 48681553 ± 9% perf-stat.i.cache-misses
604543 ± 4% +27.6% 771225 ± 4% perf-stat.i.context-switches
22934 ± 5% +30.1% 29829 ± 11% perf-stat.i.cycles-between-cache-misses
4814712 ± 12% +21.2% 5834775 ± 12% perf-stat.i.iTLB-loads
14029258 ± 2% -27.2% 10214682 ± 20% perf-stat.i.node-load-misses
3338 ± 5% +16.6% 3892 ± 9% perf-stat.overall.cycles-between-cache-misses
69.59 -5.9 63.73 ± 3% perf-stat.overall.node-load-miss-rate%
66908064 ± 4% -14.0% 57544495 ± 8% perf-stat.ps.cache-misses
673703 ± 4% +23.6% 832871 ± 5% perf-stat.ps.context-switches
61553 ± 13% +12.4% 69199 ± 12% perf-stat.ps.cpu-migrations
5187429 ± 10% +19.1% 6177199 ± 11% perf-stat.ps.iTLB-loads
13948921 ± 2% -25.7% 10365005 ± 18% perf-stat.ps.node-load-misses
28.53 ± 70% -14.1 14.39 ±141% perf-profile.calltrace.cycles-pp.btrfs_file_write_iter.new_sync_write.vfs_write.ksys_pwrite64.do_syscall_64
28.41 ± 70% -14.1 14.34 ±141% perf-profile.calltrace.cycles-pp.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write.vfs_write.ksys_pwrite64
19.52 ± 70% -9.4 10.10 ±141% perf-profile.calltrace.cycles-pp.btrfs_dirty_pages.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write.vfs_write
13.61 ± 70% -6.7 6.89 ±141% perf-profile.calltrace.cycles-pp.__clear_extent_bit.clear_extent_bit.btrfs_dirty_pages.btrfs_buffered_write.btrfs_file_write_iter
13.61 ± 70% -6.7 6.89 ±141% perf-profile.calltrace.cycles-pp.clear_extent_bit.btrfs_dirty_pages.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write
13.52 ± 70% -6.7 6.84 ±141% perf-profile.calltrace.cycles-pp.clear_state_bit.__clear_extent_bit.clear_extent_bit.btrfs_dirty_pages.btrfs_buffered_write
13.51 ± 70% -6.7 6.83 ±141% perf-profile.calltrace.cycles-pp.btrfs_clear_delalloc_extent.clear_state_bit.__clear_extent_bit.clear_extent_bit.btrfs_dirty_pages
7.95 ± 71% -4.2 3.79 ±142% perf-profile.calltrace.cycles-pp.btrfs_delalloc_reserve_metadata.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write.vfs_write
7.68 ± 71% -4.0 3.66 ±142% perf-profile.calltrace.cycles-pp.btrfs_reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write
7.67 ± 71% -4.0 3.66 ±142% perf-profile.calltrace.cycles-pp.__reserve_bytes.btrfs_reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.btrfs_buffered_write.btrfs_file_write_iter
7.45 ± 71% -3.9 3.57 ±142% perf-profile.calltrace.cycles-pp.btrfs_inode_rsv_release.btrfs_clear_delalloc_extent.clear_state_bit.__clear_extent_bit.clear_extent_bit
7.44 ± 71% -3.9 3.56 ±142% perf-profile.calltrace.cycles-pp.btrfs_block_rsv_release.btrfs_inode_rsv_release.btrfs_clear_delalloc_extent.clear_state_bit.__clear_extent_bit
7.39 ± 71% -3.9 3.54 ±142% perf-profile.calltrace.cycles-pp._raw_spin_lock.btrfs_block_rsv_release.btrfs_inode_rsv_release.btrfs_clear_delalloc_extent.clear_state_bit
7.34 ± 71% -3.8 3.50 ±143% perf-profile.calltrace.cycles-pp._raw_spin_lock.__reserve_bytes.btrfs_reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.btrfs_buffered_write
7.27 ± 71% -3.8 3.48 ±142% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.btrfs_block_rsv_release.btrfs_inode_rsv_release.btrfs_clear_delalloc_extent
7.22 ± 71% -3.8 3.44 ±143% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__reserve_bytes.btrfs_reserve_metadata_bytes.btrfs_delalloc_reserve_metadata
5.75 ± 72% -2.6 3.13 ±142% perf-profile.calltrace.cycles-pp.__set_extent_bit.set_extent_bit.btrfs_set_extent_delalloc.btrfs_dirty_pages.btrfs_buffered_write
5.75 ± 72% -2.6 3.13 ±142% perf-profile.calltrace.cycles-pp.set_extent_bit.btrfs_set_extent_delalloc.btrfs_dirty_pages.btrfs_buffered_write.btrfs_file_write_iter
5.75 ± 72% -2.6 3.13 ±142% perf-profile.calltrace.cycles-pp.btrfs_set_extent_delalloc.btrfs_dirty_pages.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write
5.67 ± 72% -2.6 3.09 ±142% perf-profile.calltrace.cycles-pp.set_state_bits.__set_extent_bit.set_extent_bit.btrfs_set_extent_delalloc.btrfs_dirty_pages
5.67 ± 72% -2.6 3.09 ±142% perf-profile.calltrace.cycles-pp.btrfs_set_delalloc_extent.set_state_bits.__set_extent_bit.set_extent_bit.btrfs_set_extent_delalloc
5.47 ± 72% -2.5 2.98 ±142% perf-profile.calltrace.cycles-pp._raw_spin_lock.btrfs_clear_delalloc_extent.clear_state_bit.__clear_extent_bit.clear_extent_bit
5.43 ± 72% -2.5 2.96 ±142% perf-profile.calltrace.cycles-pp._raw_spin_lock.btrfs_set_delalloc_extent.set_state_bits.__set_extent_bit.set_extent_bit
5.32 ± 73% -2.4 2.91 ±142% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.btrfs_set_delalloc_extent.set_state_bits.__set_extent_bit
5.32 ± 73% -2.4 2.92 ±142% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.btrfs_clear_delalloc_extent.clear_state_bit.__clear_extent_bit
28.53 ± 70% -14.1 14.39 ±141% perf-profile.children.cycles-pp.btrfs_file_write_iter
28.42 ± 70% -14.1 14.34 ±141% perf-profile.children.cycles-pp.btrfs_buffered_write
19.52 ± 70% -9.4 10.10 ±141% perf-profile.children.cycles-pp.btrfs_dirty_pages
13.72 ± 70% -6.8 6.94 ±141% perf-profile.children.cycles-pp.__clear_extent_bit
13.61 ± 70% -6.7 6.89 ±141% perf-profile.children.cycles-pp.clear_extent_bit
13.57 ± 70% -6.7 6.87 ±141% perf-profile.children.cycles-pp.clear_state_bit
13.52 ± 70% -6.7 6.84 ±141% perf-profile.children.cycles-pp.btrfs_clear_delalloc_extent
7.95 ± 71% -4.2 3.79 ±142% perf-profile.children.cycles-pp.btrfs_delalloc_reserve_metadata
7.93 ± 71% -4.2 3.77 ±142% perf-profile.children.cycles-pp.__reserve_bytes
7.68 ± 71% -4.0 3.66 ±142% perf-profile.children.cycles-pp.btrfs_reserve_metadata_bytes
7.48 ± 71% -3.9 3.58 ±142% perf-profile.children.cycles-pp.btrfs_inode_rsv_release
7.46 ± 71% -3.9 3.57 ±142% perf-profile.children.cycles-pp.btrfs_block_rsv_release
5.91 ± 72% -2.7 3.21 ±142% perf-profile.children.cycles-pp.__set_extent_bit
5.81 ± 72% -2.7 3.16 ±142% perf-profile.children.cycles-pp.set_extent_bit
5.75 ± 72% -2.6 3.13 ±142% perf-profile.children.cycles-pp.btrfs_set_extent_delalloc
5.69 ± 72% -2.6 3.10 ±142% perf-profile.children.cycles-pp.set_state_bits
5.68 ± 72% -2.6 3.09 ±142% perf-profile.children.cycles-pp.btrfs_set_delalloc_extent
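The %change column above mixes two conventions: counters such as ops_per_sec are reported as a relative change in percent, while metrics that are already percentages (the names ending in "%", and the perf-profile rows) are reported as an absolute difference in percentage points. The following is a minimal sketch of that arithmetic, not part of the lkp tooling; the helper names percent_change and point_change are hypothetical, and the input values are copied from the rows above.

```python
# Sketch only: recompute a few entries of the %change column.
# percent_change and point_change are hypothetical helper names,
# not functions from lkp-tests.

def percent_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_change(base_pct: float, new_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return new_pct - base_pct


# (value at 76a4efa809, value at e506d1dac0), copied from the table
counters = {
    "stress-ng.sigrt.ops_per_sec": (655554, 906196),
    "stress-ng.sigsuspend.ops_per_sec": (665085, 1053922),
    "perf-stat.i.cache-misses": (59213457, 48681553),
}
rates = {
    "perf-stat.i.cache-miss-rate%": (23.91, 21.54),
    "perf-stat.overall.node-load-miss-rate%": (69.59, 63.73),
}

for name, (base, new) in counters.items():
    print(f"{percent_change(base, new):+7.1f}%  {name}")
for name, (base, new) in rates.items():
    print(f"{point_change(base, new):+7.1f}   {name}")
```

Rounded to one decimal place, these reproduce the table entries for the five rows used here. The perf-profile rows carry a stddev of 70% or more on both sides, so their point differences are probably better read as noise than as an effect of the commit.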
stress-ng.sigsuspend.ops_per_sec
[ASCII plot, layout lost in extraction: y-axis spans 600000 to 1.1e+06 ops_per_sec; samples marked O lie mostly between 950000 and 1.05e+06, while the other series stays roughly between 650000 and 750000]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
View attachment "config-5.10.0-rc2-00372-ge506d1dac0ed" of type "text/plain" (171265 bytes)
View attachment "job-script" of type "text/plain" (8183 bytes)
View attachment "job.yaml" of type "text/plain" (5697 bytes)
View attachment "reproduce" of type "text/plain" (464 bytes)