Message-ID: <202504221604.38512645-lkp@intel.com>
Date: Tue, 22 Apr 2025 16:42:05 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>, "Paul E. McKenney"
<paulmck@...nel.org>, Thomas Gleixner <tglx@...utronix.de>, Boqun Feng
<boqun.feng@...il.com>, Joel Fernandes <joel@...lfernandes.org>, "Josh
Triplett" <josh@...htriplett.org>, Lai jiangshan <jiangshanlai@...il.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Mengen Sun
<mengensun@...cent.com>, Steven Rostedt <rostedt@...dmis.org>, "Uladzislau
Rezki (Sony)" <urezki@...il.com>, YueHong Wu <yuehongwu@...cent.com>, Zqiang
<qiang.zhang1211@...il.com>, <oliver.sang@...el.com>
Subject: [linus:master] [ucount] b4dc0bee2a: stress-ng.set.ops_per_sec 7.5%
improvement
Hello,
kernel test robot noticed a 7.5% improvement of stress-ng.set.ops_per_sec on:
commit: b4dc0bee2a749083028afba346910e198653f42a ("ucount: use rcuref_t for reference counting")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:
nr_threads: 100%
testtime: 60s
test: set
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250422/202504221604.38512645-lkp@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/set/stress-ng/60s
commit:
5f01a22c5b ("ucount: use RCU for ucounts lookups")
b4dc0bee2a ("ucount: use rcuref_t for reference counting")
5f01a22c5b231dd5 b4dc0bee2a749083028afba3469
---------------- ---------------------------
         %stddev  %change           %stddev
             \       |                  \
     10.78          -1.6         9.22        mpstat.cpu.all.soft%
    150.17 ±  5%   -54.4%       68.50 ± 11%  perf-c2c.DRAM.local
     14.70 ± 13%   -18.0%       12.05 ±  8%  vmstat.procs.r
    235759          +7.0%      252328        vmstat.system.cs
    229993          -1.6%      226301        vmstat.system.in
      1456 ±  3%   -40.2%      870.70 ± 59%  sched_debug.cfs_rq:/.avg_vruntime.min
     56228 ±133%   -81.2%       10577 ± 72%  sched_debug.cfs_rq:/.load.avg
    605295 ±186%   -89.2%       65458 ± 51%  sched_debug.cfs_rq:/.load.stddev
      1456 ±  3%   -40.2%      870.70 ± 59%  sched_debug.cfs_rq:/.min_vruntime.min
    259.95 ± 14%   -33.5%      172.79 ± 21%  sched_debug.cpu.curr->pid.avg
      1122 ±  7%   -22.0%      874.99 ± 15%  sched_debug.cpu.curr->pid.stddev
   7692146          +7.5%     8265701        stress-ng.set.ops
    128199          +7.5%      137758        stress-ng.set.ops_per_sec
     28263 ±  2%   -34.1%       18622 ±  2%  stress-ng.time.involuntary_context_switches
     77524          -1.7%       76216        stress-ng.time.minor_page_faults
    750.50          -3.0%      728.33        stress-ng.time.percent_of_cpu_this_job_got
    416.18          -3.8%      400.28        stress-ng.time.system_time
   7083512          +8.2%     7667679        stress-ng.time.voluntary_context_switches
    141813          +4.2%      147721        proc-vmstat.nr_shmem
   1695593          +2.9%     1745184        proc-vmstat.numa_hit
   1462962          +3.4%     1512676        proc-vmstat.numa_local
     99906 ±  9%   -23.1%       76793 ± 17%  proc-vmstat.numa_pages_migrated
    321573 ± 14%   -32.0%      218744 ± 22%  proc-vmstat.numa_pte_updates
   2547220          +3.5%     2636020        proc-vmstat.pgalloc_normal
   2441293          +3.4%     2524816        proc-vmstat.pgfree
     99906 ±  9%   -23.1%       76793 ± 17%  proc-vmstat.pgmigrate_success
      0.19 ±143%   -96.6%        0.01 ± 49%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.32 ±133%   -72.7%        0.09 ± 16%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.39 ±135%   -98.1%        0.01 ± 52%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
    240.34 ±217%   -98.5%        3.55 ±  6%  perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.02 ±  8%   -19.4%        0.01 ±  8%  perf-sched.total_sch_delay.average.ms
      3.02          -9.3%        2.74        perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__do_sys_newuname
    114.29 ±  2%   +18.6%      135.51 ±  2%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    128.00 ±  4%   +14.1%      146.00        perf-sched.wait_and_delay.count.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
     18816 ±  2%   -16.6%       15697 ±  2%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     99.29 ±187%   -98.1%        1.85 ±117%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      3.01          -9.3%        2.73        perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.__do_sys_newuname
    114.18 ±  2%   +18.6%      135.41 ±  2%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    195.32 ±189%   -98.6%        2.64 ±124%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      2.37          -1.8%        2.33        perf-stat.i.MPKI
 2.841e+09          +3.2%   2.932e+09        perf-stat.i.branch-instructions
      1.47          -0.1         1.40        perf-stat.i.branch-miss-rate%
     26.62          +1.0        27.64        perf-stat.i.cache-miss-rate%
 1.193e+08          -2.5%   1.163e+08        perf-stat.i.cache-references
    243399          +7.3%      261109        perf-stat.i.context-switches
      7.74         -13.1%        6.73        perf-stat.i.cpi
 1.006e+11         -10.6%   8.988e+10        perf-stat.i.cpu-cycles
      4164          -7.0%        3871        perf-stat.i.cpu-migrations
      3253         -11.7%        2872        perf-stat.i.cycles-between-cache-misses
 1.378e+10          +3.1%   1.421e+10        perf-stat.i.instructions
      0.15         +15.3%        0.17        perf-stat.i.ipc
      1.03         +10.3%        1.13        perf-stat.i.metric.K/sec
      2.29          -1.9%        2.25        perf-stat.overall.MPKI
      1.65          -0.1         1.58        perf-stat.overall.branch-miss-rate%
     26.43          +1.0        27.43        perf-stat.overall.cache-miss-rate%
      7.30         -13.3%        6.33        perf-stat.overall.cpi
      3189         -11.6%        2818        perf-stat.overall.cycles-between-cache-misses
      0.14         +15.4%        0.16        perf-stat.overall.ipc
 2.794e+09          +3.2%   2.883e+09        perf-stat.ps.branch-instructions
 1.174e+08          -2.6%   1.144e+08        perf-stat.ps.cache-references
    239347          +7.3%      256763        perf-stat.ps.context-switches
 9.894e+10         -10.6%   8.843e+10        perf-stat.ps.cpu-cycles
      4096          -7.0%        3808        perf-stat.ps.cpu-migrations
 1.355e+10          +3.1%   1.398e+10        perf-stat.ps.instructions
  8.25e+11          +3.2%   8.511e+11        perf-stat.total.instructions
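
For readers who want to cross-check the table, the following is a minimal sketch of how the %change column and the perf-stat ratios appear to be derived. It is not taken from the lkp-tests sources. The helper names (pct_change, point_change, ipc, cpi) are hypothetical. The sketch assumes, judging only from the rows above, that metrics whose names end in "%" are reported as an absolute difference (for example mpstat.cpu.all.soft%: 10.78 to 9.22, shown as -1.6) while all other metrics are reported as a relative change. Because the table prints rounded values, recomputing from them may differ from the reported figure in the last digit for some rows.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change in percent, as shown for most rows of the table."""
    return (new - base) / base * 100.0


def point_change(base: float, new: float) -> float:
    """Absolute difference; assumed for metrics that are already percentages."""
    return new - base


def ipc(instructions: float, cycles: float) -> float:
    """Instructions per cycle."""
    return instructions / cycles


def cpi(instructions: float, cycles: float) -> float:
    """Cycles per instruction (the reciprocal of IPC)."""
    return cycles / instructions


# Values copied from the table above: (parent 5f01a22c5b, patched b4dc0bee2a).
ops_per_sec = (128199, 137758)        # stress-ng.set.ops_per_sec
invol_cs = (28263, 18622)             # stress-ng.time.involuntary_context_switches
soft_pct = (10.78, 9.22)              # mpstat.cpu.all.soft%
instructions = (1.355e10, 1.398e10)   # perf-stat.ps.instructions
cycles = (9.894e10, 8.843e10)         # perf-stat.ps.cpu-cycles

print(f"ops_per_sec change: {pct_change(*ops_per_sec):+.1f}%")
print(f"involuntary cs change: {pct_change(*invol_cs):+.1f}%")
print(f"soft% change (points): {point_change(*soft_pct):+.1f}")
for label, ins, cyc in zip(("parent", "patched"), instructions, cycles):
    print(f"{label}: ipc={ipc(ins, cyc):.2f} cpi={cpi(ins, cyc):.2f}")
```

Recomputed this way, the headline stress-ng.set.ops_per_sec row and the perf-stat.overall.ipc and perf-stat.overall.cpi rows agree with the reported values after rounding.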
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki