Message-ID: <202211281010.d5c055e1-oliver.sang@intel.com>
Date: Mon, 28 Nov 2022 16:33:42 +0800
From: kernel test robot <oliver.sang@...el.com>
To: John Ogness <john.ogness@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
Petr Mladek <pmladek@...e.com>,
Thomas Gleixner <tglx@...utronix.de>,
Miguel Ojeda <ojeda@...nel.org>,
"Paul E. McKenney" <paulmck@...nel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, <ying.huang@...el.com>,
<feng.tang@...el.com>, <zhengjun.xing@...ux.intel.com>,
<fengwei.yin@...el.com>
Subject: [linux-next:master] [printk] 8bdbdd7f43:
will-it-scale.per_process_ops 3.9% improvement
Please note that we haven't been able to connect the profiling data to this
improvement, but in our tests the data is very stable.
for this commit:
"will-it-scale.per_process_ops": [
75636,
75623,
75642,
75613,
75592,
75628,
74016,
75637,
75622,
75617,
75605,
74013,
75629,
75618,
75595,
75614,
75611,
75619,
75629,
75619
],
for parent:
"will-it-scale.per_process_ops": [
72665,
72628,
72665,
72656,
72668,
72642,
72660,
72642,
72648,
72661,
72650,
72648,
72651,
72639,
72630,
72641,
72655,
72650,
72624,
72649
],
Thanks a lot to Fengwei (CCed), who helped review and made the comments below:
"This patch could bring a performance improvement. It uses an RCU lock to replace
a mutex for some consoles, so I expect some reduction in lock contention. I didn't
find it in the perf calltrace profiling, but could see some in the perf self
profiling. Without a calltrace, we don't know who the owner of the lock is."
So we made the report below, FYI.
Greetings,
FYI, we noticed a 3.9% improvement of will-it-scale.per_process_ops due to commit:
commit: 8bdbdd7f43cd74c7faca6add8a62d541503ae21d ("printk: Prepare for SRCU console list protection")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: will-it-scale
on test machine: 128 threads 4 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
with following parameters:
nr_task: 50%
mode: process
test: open1
cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-11/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp2/open1/will-it-scale
commit:
318eb6d938 ("printk: Convert console_drivers list to hlist")
8bdbdd7f43 ("printk: Prepare for SRCU console list protection")
318eb6d938484a5a 8bdbdd7f43cd74c7faca6add8a6
---------------- ---------------------------
%stddev %change %stddev
\ | \
4649546 +3.9% 4829404 will-it-scale.64.processes
72648 +3.9% 75458 will-it-scale.per_process_ops
4649546 +3.9% 4829404 will-it-scale.workload
346119 +12.7% 390049 meminfo.SUnreclaim
3480 -22.8% 2688 ± 2% vmstat.system.cs
86566 ± 7% +31.1% 113460 ± 5% numa-meminfo.node0.SUnreclaim
65513 ± 13% +40.6% 92118 ± 6% numa-meminfo.node1.SUnreclaim
86566 +12.7% 97547 proc-vmstat.nr_slab_unreclaimable
20789147 +6.0% 22031599 proc-vmstat.numa_hit
20615296 +6.0% 21857885 proc-vmstat.numa_local
80423862 +6.2% 85382185 proc-vmstat.pgalloc_normal
80433923 +6.2% 85390274 proc-vmstat.pgfree
8319772 ± 6% -20.4% 6621861 ± 9% sched_debug.cfs_rq:/.min_vruntime.max
1246763 ± 12% -46.0% 673096 ± 12% sched_debug.cfs_rq:/.min_vruntime.stddev
-3067572 -58.9% -1261852 sched_debug.cfs_rq:/.spread0.min
1246909 ± 12% -46.0% 673159 ± 12% sched_debug.cfs_rq:/.spread0.stddev
6632 ± 3% -15.0% 5638 ± 5% sched_debug.cpu.nr_switches.avg
2796 ± 9% -22.2% 2176 ± 9% sched_debug.cpu.nr_switches.min
1594604 +206.3% 4884774 numa-numastat.node0.local_node
1644243 +199.8% 4929305 numa-numastat.node0.numa_hit
1592483 +206.3% 4878540 numa-numastat.node1.local_node
1638420 +200.1% 4916699 numa-numastat.node1.numa_hit
8690370 -30.6% 6028530 numa-numastat.node2.local_node
8729078 -30.4% 6074883 numa-numastat.node2.numa_hit
8734995 -30.6% 6063318 numa-numastat.node3.local_node
8774563 -30.4% 6107989 numa-numastat.node3.numa_hit
21691 ± 7% +32.4% 28715 ± 5% numa-vmstat.node0.nr_slab_unreclaimable
1644113 +199.8% 4929234 numa-vmstat.node0.numa_hit
1594474 +206.4% 4884703 numa-vmstat.node0.numa_local
16427 ± 12% +42.0% 23321 ± 7% numa-vmstat.node1.nr_slab_unreclaimable
1638283 +200.1% 4916648 numa-vmstat.node1.numa_hit
1592346 +206.4% 4878490 numa-vmstat.node1.numa_local
8728930 -30.4% 6074798 numa-vmstat.node2.numa_hit
8690222 -30.6% 6028445 numa-vmstat.node2.numa_local
8774491 -30.4% 6108017 numa-vmstat.node3.numa_hit
8734922 -30.6% 6063345 numa-vmstat.node3.numa_local
43.31 -0.9 42.42 perf-profile.calltrace.cycles-pp.security_file_open.do_dentry_open.do_open.path_openat.do_filp_open
43.27 -0.9 42.38 perf-profile.calltrace.cycles-pp.apparmor_file_open.security_file_open.do_dentry_open.do_open.path_openat
43.85 -0.9 42.97 perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2
44.30 -0.9 43.41 perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
0.63 ± 6% +0.2 0.80 ± 28% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.apparmor_file_open.security_file_open.do_dentry_open.do_open
0.59 ± 6% +0.2 0.77 ± 29% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.apparmor_file_open.security_file_open.do_dentry_open
43.33 -0.9 42.43 perf-profile.children.cycles-pp.security_file_open
43.31 -0.9 42.41 perf-profile.children.cycles-pp.apparmor_file_open
43.89 -0.9 43.00 perf-profile.children.cycles-pp.do_dentry_open
44.32 -0.9 43.43 perf-profile.children.cycles-pp.do_open
0.24 ± 9% -0.1 0.16 ± 11% perf-profile.children.cycles-pp.menu_select
0.19 ± 10% -0.1 0.12 ± 15% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.14 ± 9% -0.1 0.08 ± 11% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.08 ± 10% -0.0 0.03 ± 82% perf-profile.children.cycles-pp.get_next_timer_interrupt
0.11 ± 13% -0.0 0.08 ± 18% perf-profile.children.cycles-pp.tick_nohz_next_event
0.12 ± 7% +0.0 0.15 ± 4% perf-profile.children.cycles-pp.shuffle_freelist
0.14 ± 7% +0.0 0.18 ± 2% perf-profile.children.cycles-pp.allocate_slab
0.20 ± 5% +0.0 0.25 ± 3% perf-profile.children.cycles-pp.___slab_alloc
42.69 -1.1 41.62 ± 2% perf-profile.self.cycles-pp.apparmor_file_open
0.66 ± 11% -0.5 0.16 ± 10% perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt
0.14 ± 9% -0.1 0.08 ± 13% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.10 ± 7% +0.0 0.13 ± 3% perf-profile.self.cycles-pp.shuffle_freelist
2.99 +3.5% 3.09 perf-stat.i.MPKI
5.498e+09 +3.8% 5.706e+09 perf-stat.i.branch-instructions
34.60 -1.7 32.95 perf-stat.i.cache-miss-rate%
28569086 +2.3% 29217344 perf-stat.i.cache-misses
82561656 +7.4% 88636951 perf-stat.i.cache-references
3383 -23.5% 2587 ± 2% perf-stat.i.context-switches
224.62 ± 3% -7.1% 208.68 ± 3% perf-stat.i.cpu-migrations
8.243e+09 +3.7% 8.545e+09 perf-stat.i.dTLB-loads
4.682e+09 +3.6% 4.849e+09 perf-stat.i.dTLB-stores
2.764e+10 +3.8% 2.868e+10 perf-stat.i.instructions
785.42 +5.8% 831.18 perf-stat.i.metric.K/sec
143.88 +3.7% 149.17 perf-stat.i.metric.M/sec
6893040 +4.8% 7226385 perf-stat.i.node-load-misses
206202 ± 2% +10.1% 227052 perf-stat.i.node-loads
76.95 -2.8 74.10 perf-stat.i.node-store-miss-rate%
8358863 -7.7% 7714585 perf-stat.i.node-store-misses
2506806 ± 2% +7.6% 2698200 ± 2% perf-stat.i.node-stores
2.99 +3.5% 3.09 perf-stat.overall.MPKI
34.61 -1.6 32.99 perf-stat.overall.cache-miss-rate%
76.93 -2.8 74.09 perf-stat.overall.node-store-miss-rate%
5.479e+09 +3.8% 5.687e+09 perf-stat.ps.branch-instructions
28481165 +2.3% 29144556 perf-stat.ps.cache-misses
82289248 +7.4% 88361058 perf-stat.ps.cache-references
3371 -23.6% 2577 ± 2% perf-stat.ps.context-switches
223.97 ± 3% -7.0% 208.27 ± 3% perf-stat.ps.cpu-migrations
8.214e+09 +3.7% 8.517e+09 perf-stat.ps.dTLB-loads
4.665e+09 +3.6% 4.833e+09 perf-stat.ps.dTLB-stores
2.754e+10 +3.8% 2.858e+10 perf-stat.ps.instructions
6871767 +4.9% 7207616 perf-stat.ps.node-load-misses
205592 ± 2% +10.1% 226439 perf-stat.ps.node-loads
8330952 -7.7% 7692044 perf-stat.ps.node-store-misses
2498719 ± 2% +7.7% 2690666 ± 2% perf-stat.ps.node-stores
8.33e+12 +3.9% 8.654e+12 perf-stat.total.instructions
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
View attachment "config-6.1.0-rc1-00015-g8bdbdd7f43cd" of type "text/plain" (166072 bytes)
View attachment "job-script" of type "text/plain" (7978 bytes)
View attachment "job.yaml" of type "text/plain" (5554 bytes)
View attachment "reproduce" of type "text/plain" (345 bytes)