Message-ID: <20211005082731.GA15539@xsang-OptiPlex-9020>
Date: Tue, 5 Oct 2021 16:27:31 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Waiman Long <longman@...hat.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Andrii Nakryiko <andrii@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...ux.intel.com
Subject: [rcu] 925da92ba5: will-it-scale.per_process_ops 2.5% improvement
Greetings,
FYI, we noticed a 2.5% improvement in will-it-scale.per_process_ops due to commit:
commit: 925da92ba5cb0c82d07cdd5049a07e40f54e9c44 ("rcu: Avoid unneeded function call in rcu_read_unlock()")
https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git rcu/next
in testcase: will-it-scale
on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
with the following parameters:
nr_task: 16
mode: process
test: getppid1
cpufreq_governor: performance
ucode: 0x5003006
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
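For orientation, the operation measured by getppid1 is a tight loop around the getppid() system call. The sketch below is not the will-it-scale source (which is written in C); it is a rough Python analogue, and the function name `count_getppid_calls` and the 0.1 s interval are illustrative assumptions. Interpreter overhead dominates here, so the absolute counts are not comparable with the numbers in this report.

```python
import os
import time


def count_getppid_calls(duration_s: float) -> int:
    """Call getppid() in a tight loop for duration_s seconds; return the call count."""
    deadline = time.monotonic() + duration_s
    calls = 0
    while time.monotonic() < deadline:
        os.getppid()  # the system call under test
        calls += 1
    return calls


if __name__ == "__main__":
    calls = count_getppid_calls(0.1)
    print(f"{calls} getppid() calls in 0.1 s")
```

In this job the benchmark runs one such loop in each of 16 processes (nr_task: 16, mode: process) and reports both the aggregate and the per-process rate.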
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/process/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/getppid1/will-it-scale/0x5003006
commit:
f0b2b2df54 ("rcu: Fix existing exp request check in sync_sched_exp_online_cleanup()")
925da92ba5 ("rcu: Avoid unneeded function call in rcu_read_unlock()")
f0b2b2df5423fb36 925da92ba5cb0c82d07cdd5049a
---------------- ---------------------------
         %stddev   %change          %stddev
            \         |                \
 2.029e+08           +2.5%   2.08e+08        will-it-scale.16.processes
  12684091           +2.5%   12998034        will-it-scale.per_process_ops
 2.029e+08           +2.5%   2.08e+08        will-it-scale.workload
    746.67 ±  6%    -30.7%     517.33 ± 27%  slabinfo.kmalloc-rcl-128.active_objs
    746.67 ±  6%    -30.7%     517.33 ± 27%  slabinfo.kmalloc-rcl-128.num_objs
      2816 ±113%    -65.6%     968.17 ± 20%  interrupts.CPU21.CAL:Function_call_interrupts
      2150 ± 33%   +155.5%       5494 ± 37%  interrupts.CPU52.NMI:Non-maskable_interrupts
      2150 ± 33%   +155.5%       5494 ± 37%  interrupts.CPU52.PMI:Performance_monitoring_interrupts
     15629 ± 22%    +31.1%      20491 ± 16%  softirqs.CPU3.RCU
     14781 ± 15%    +32.2%      19544 ± 19%  softirqs.CPU46.RCU
     32737 ± 21%    -44.7%      18119 ± 58%  softirqs.CPU51.SCHED
      4.07 ±  7%     -0.9        3.15 ± 12%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.getppid
      6.74 ±  5%     -1.4        5.34 ± 16%  perf-profile.children.cycles-pp.__x64_sys_getppid
      1.05 ± 10%     -0.2        0.89 ±  9%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.96 ± 11%     -0.1        0.81 ±  9%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.23 ± 12%     -0.1        0.17 ± 13%  perf-profile.children.cycles-pp.clockevents_program_event
      1.36 ±  6%     -0.7        0.65 ± 12%  perf-profile.self.cycles-pp.__x64_sys_getppid
      0.08 ±  8%     -0.0        0.06 ± 13%  perf-profile.self.cycles-pp.cpuidle_enter_state
 9.855e+09           -6.1%  9.255e+09        perf-stat.i.branch-instructions
   3709820 ±  2%    -13.3%    3215247 ± 11%  perf-stat.i.branch-misses
      0.97           +2.5%       0.99        perf-stat.i.cpi
 1.559e+10           -3.0%  1.512e+10        perf-stat.i.dTLB-loads
      0.00           +0.0        0.00        perf-stat.i.dTLB-store-miss-rate%
 1.081e+10           -5.3%  1.024e+10        perf-stat.i.dTLB-stores
     71.57 ±  9%    -13.1       58.45 ±  5%  perf-stat.i.iTLB-load-miss-rate%
 4.723e+10           -2.5%  4.605e+10        perf-stat.i.instructions
     21047 ± 26%    +63.2%      34353 ±  9%  perf-stat.i.instructions-per-iTLB-miss
      1.03           -2.4%       1.01        perf-stat.i.ipc
    566.45           -4.5%     540.90        perf-stat.i.metric.M/sec
      0.97           +2.5%       0.99        perf-stat.overall.cpi
      0.00           +0.0        0.00        perf-stat.overall.dTLB-store-miss-rate%
      1.03           -2.4%       1.01        perf-stat.overall.ipc
     70024           -4.7%      66743        perf-stat.overall.path-length
 9.822e+09           -6.1%  9.224e+09        perf-stat.ps.branch-instructions
   3698900 ±  2%    -13.2%    3212114 ± 11%  perf-stat.ps.branch-misses
 1.554e+10           -3.0%  1.507e+10        perf-stat.ps.dTLB-loads
 1.077e+10           -5.3%  1.021e+10        perf-stat.ps.dTLB-stores
 4.708e+10           -2.5%   4.59e+10        perf-stat.ps.instructions
 1.421e+13           -2.3%  1.388e+13        perf-stat.total.instructions
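As a quick sanity check on the headline rows, the short script below recomputes the percentage change and the aggregate throughput from the per-process figures in the table. The helper name `pct_change` is illustrative. The last two lines assume that path-length is total instructions divided by workload; that is an inference from the numbers, not something stated in this report.

```python
NR_TASK = 16

base_ops = 12684091   # will-it-scale.per_process_ops at f0b2b2df54
new_ops = 12998034    # will-it-scale.per_process_ops at 925da92ba5


def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    print(f"per_process_ops change: {pct_change(base_ops, new_ops):+.1f}%")
    print(f"aggregate ops (base): {NR_TASK * base_ops:.3e}")
    print(f"aggregate ops (new):  {NR_TASK * new_ops:.3e}")
    # Assumption: path-length ~ perf-stat.total.instructions / will-it-scale.workload
    print(f"instructions per op (base): {1.421e13 / 2.029e8:.0f}")
    print(f"instructions per op (new):  {1.388e13 / 2.08e8:.0f}")
```

With these inputs the change comes out at +2.5%, matching the table, and the instructions-per-operation estimates land within 1% of the reported path-length values (70024 and 66743).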
will-it-scale.per_process_ops
[ASCII plot, column alignment not recoverable: the y-axis runs from 1.21e+07 to 1.31e+07; the [O] samples cluster near 1.3e+07 except for two near 1.22e+07, and the other samples lie between roughly 1.26e+07 and 1.28e+07.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.15.0-rc1-00014-g925da92ba5cb" of type "text/plain" (169304 bytes)
View attachment "job-script" of type "text/plain" (7964 bytes)
View attachment "job.yaml" of type "text/plain" (5347 bytes)
View attachment "reproduce" of type "text/plain" (340 bytes)