[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20210311021448.GA9906@xsang-OptiPlex-9020>
Date: Thu, 11 Mar 2021 10:14:49 +0800
From: kernel test robot <oliver.sang@...el.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Petr Mladek <pmladek@...e.com>,
LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [printk] 996e966640: stress-ng.klog.ops_per_sec 1097.4% improvement
Greeting,
FYI, we noticed a 1097.4% improvement of stress-ng.klog.ops_per_sec due to commit:
commit: 996e966640ddea7b535cbe7b749e87a3b24f37e8 ("printk: remove logbuf_lock")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with following parameters:
nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: ext4
class: os
test: klog
cpufreq_governor: performance
ucode: 0x5003006
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
os/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/klog/stress-ng/60s/0x5003006
commit:
f9f3f02db9 ("printk: introduce a kmsg_dump iterator")
996e966640 ("printk: remove logbuf_lock")
f9f3f02db98bbe67 996e966640ddea7b535cbe7b749
---------------- ---------------------------
%stddev %change %stddev
\ | \
26314 +1097.3% 315061 stress-ng.klog.ops
438.54 +1097.4% 5250 stress-ng.klog.ops_per_sec
961.83 ± 2% -15.9% 809.00 stress-ng.time.involuntary_context_switches
8450 ± 39% +4572.7% 394845 ±162% cpuidle.POLL.usage
7225 ± 22% +27.2% 9187 ± 11% softirqs.CPU2.SCHED
1844 -3.7% 1775 vmstat.system.cs
176153 ± 70% -50.0% 88051 ±141% numa-vmstat.node0.nr_dirtied
176152 ± 70% -50.0% 88051 ±141% numa-vmstat.node0.nr_written
88056 ±141% +100.1% 176157 ± 70% numa-vmstat.node1.nr_dirtied
88055 ±141% +100.1% 176156 ± 70% numa-vmstat.node1.nr_written
38516 +39.4% 53679 ± 27% sched_debug.cpu.clock.avg
38521 +39.4% 53683 ± 27% sched_debug.cpu.clock.max
38510 +39.4% 53674 ± 27% sched_debug.cpu.clock.min
38388 +39.2% 53441 ± 27% sched_debug.cpu.clock_task.avg
38507 +39.1% 53581 ± 27% sched_debug.cpu.clock_task.max
38513 +39.4% 53676 ± 27% sched_debug.cpu_clk
38019 +39.9% 53181 ± 28% sched_debug.ktime
38879 +39.0% 54033 ± 27% sched_debug.sched_clk
4.47 ± 7% -92.3% 0.35 ± 37% perf-stat.i.MPKI
1.824e+09 +709.2% 1.476e+10 perf-stat.i.branch-instructions
0.49 ± 7% -0.2 0.28 ± 4% perf-stat.i.branch-miss-rate%
8126698 ± 3% +291.8% 31840686 perf-stat.i.branch-misses
30.49 ± 10% -14.0 16.46 ± 3% perf-stat.i.cache-miss-rate%
9809108 ± 8% -83.9% 1581490 ± 3% perf-stat.i.cache-misses
32576128 ± 4% -70.0% 9756759 ± 6% perf-stat.i.cache-references
1700 -4.5% 1623 perf-stat.i.context-switches
3.61 -86.3% 0.49 perf-stat.i.cpi
106.88 -1.6% 105.21 perf-stat.i.cpu-migrations
2857 ± 7% +557.6% 18789 ± 4% perf-stat.i.cycles-between-cache-misses
1.543e+09 +608.5% 1.093e+10 ± 2% perf-stat.i.dTLB-loads
7.862e+08 +893.9% 7.813e+09 perf-stat.i.dTLB-stores
46.51 +3.6 50.13 ± 5% perf-stat.i.iTLB-load-miss-rate%
1764355 +14.1% 2013317 perf-stat.i.iTLB-load-misses
7.571e+09 +699.3% 6.051e+10 perf-stat.i.instructions
4308 +598.5% 30094 perf-stat.i.instructions-per-iTLB-miss
0.29 +636.7% 2.13 perf-stat.i.ipc
12.32 -8.6% 11.26 perf-stat.i.major-faults
0.63 ± 11% +72.3% 1.08 ± 12% perf-stat.i.metric.K/sec
43.60 +700.6% 349.10 perf-stat.i.metric.M/sec
3214 -2.0% 3149 perf-stat.i.minor-faults
97.27 -10.4 86.91 perf-stat.i.node-load-miss-rate%
2442015 ± 9% -89.9% 246595 ± 17% perf-stat.i.node-load-misses
96.91 -11.0 85.92 ± 8% perf-stat.i.node-store-miss-rate%
1985853 ± 9% -97.0% 59228 ± 6% perf-stat.i.node-store-misses
3226 -2.0% 3160 perf-stat.i.page-faults
4.29 ± 3% -96.2% 0.16 ± 6% perf-stat.overall.MPKI
0.44 ± 3% -0.2 0.22 perf-stat.overall.branch-miss-rate%
30.37 ± 11% -14.1 16.27 ± 6% perf-stat.overall.cache-miss-rate%
3.64 -87.4% 0.46 perf-stat.overall.cpi
2824 ± 7% +522.1% 17572 ± 3% perf-stat.overall.cycles-between-cache-misses
0.00 ± 81% -0.0 0.00 ± 74% perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 31% -0.0 0.00 ± 30% perf-stat.overall.dTLB-store-miss-rate%
46.46 +3.7 50.11 ± 5% perf-stat.overall.iTLB-load-miss-rate%
4306 +598.3% 30070 perf-stat.overall.instructions-per-iTLB-miss
0.27 +694.3% 2.18 perf-stat.overall.ipc
98.28 -13.2 85.05 perf-stat.overall.node-load-miss-rate%
99.06 -21.7 77.41 ± 6% perf-stat.overall.node-store-miss-rate%
1.795e+09 +709.1% 1.452e+10 perf-stat.ps.branch-instructions
7911307 ± 3% +296.0% 31331201 perf-stat.ps.branch-misses
9665170 ± 8% -83.9% 1555589 ± 3% perf-stat.ps.cache-misses
31965660 ± 4% -70.0% 9598649 ± 6% perf-stat.ps.cache-references
1664 -4.1% 1597 perf-stat.ps.context-switches
104.80 -1.3% 103.49 perf-stat.ps.cpu-migrations
1.518e+09 +608.7% 1.076e+10 ± 2% perf-stat.ps.dTLB-loads
7.733e+08 +894.4% 7.689e+09 perf-stat.ps.dTLB-stores
1730085 +14.5% 1980925 perf-stat.ps.iTLB-load-misses
7.449e+09 +699.5% 5.955e+10 perf-stat.ps.instructions
2408929 ± 9% -89.9% 242555 ± 17% perf-stat.ps.node-load-misses
1959947 ± 9% -97.0% 58243 ± 6% perf-stat.ps.node-store-misses
4.701e+11 +699.2% 3.757e+12 perf-stat.total.instructions
58.25 ± 7% -58.3 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock.syslog_print_all.do_syslog.__x64_sys_syslog.do_syscall_64
57.16 ± 7% -57.2 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.syslog_print_all.do_syslog.__x64_sys_syslog
0.80 ± 8% +0.2 1.00 ± 4% perf-profile.calltrace.cycles-pp.desc_read.desc_read_finalized_seq._prb_read_valid.prb_read_valid.syslog_print_all
0.00 +0.6 0.57 ± 4% perf-profile.calltrace.cycles-pp.put_dec_trunc8.number.vsnprintf.sprintf.info_print_prefix
0.00 +0.6 0.58 ± 3% perf-profile.calltrace.cycles-pp.memmove.record_print_text.syslog_print_all.do_syslog.__x64_sys_syslog
0.00 +0.6 0.61 ± 7% perf-profile.calltrace.cycles-pp.copy_user_generic_unrolled._copy_to_user.syslog_print_all.do_syslog.__x64_sys_syslog
0.88 ± 8% +0.7 1.61 ± 3% perf-profile.calltrace.cycles-pp.desc_read_finalized_seq._prb_read_valid.prb_read_valid.syslog_print_all.do_syslog
0.00 +0.7 0.74 ± 4% perf-profile.calltrace.cycles-pp.memcpy_erms._prb_read_valid.prb_read_valid.syslog_print_all.do_syslog
0.00 +0.8 0.84 ± 5% perf-profile.calltrace.cycles-pp.memcpy_erms.record_print_text.syslog_print_all.do_syslog.__x64_sys_syslog
0.00 +1.3 1.28 ± 4% perf-profile.calltrace.cycles-pp.desc_read.desc_read_finalized_seq._prb_read_valid.prb_read_valid_info.find_first_fitting_seq
0.00 +1.4 1.40 ± 4% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_to_user.syslog_print_all.do_syslog.__x64_sys_syslog
0.00 +1.4 1.44 ± 4% perf-profile.calltrace.cycles-pp.__check_heap_object.__check_object_size.syslog_print_all.do_syslog.__x64_sys_syslog
0.00 +2.0 2.02 ± 4% perf-profile.calltrace.cycles-pp.memchr.record_print_text.syslog_print_all.do_syslog.__x64_sys_syslog
0.00 +2.0 2.03 ± 2% perf-profile.calltrace.cycles-pp.desc_read_finalized_seq._prb_read_valid.prb_read_valid_info.find_first_fitting_seq.syslog_print_all
0.00 +2.2 2.20 ± 4% perf-profile.calltrace.cycles-pp.memchr._prb_read_valid.prb_read_valid_info.find_first_fitting_seq.syslog_print_all
0.00 +2.9 2.93 ± 2% perf-profile.calltrace.cycles-pp.__check_object_size.syslog_print_all.do_syslog.__x64_sys_syslog.do_syscall_64
0.00 +2.9 2.93 ± 3% perf-profile.calltrace.cycles-pp._copy_to_user.syslog_print_all.do_syslog.__x64_sys_syslog.do_syscall_64
1.12 ± 8% +3.3 4.44 ± 4% perf-profile.calltrace.cycles-pp._prb_read_valid.prb_read_valid.syslog_print_all.do_syslog.__x64_sys_syslog
1.13 ± 8% +3.5 4.63 ± 4% perf-profile.calltrace.cycles-pp.prb_read_valid.syslog_print_all.do_syslog.__x64_sys_syslog.do_syscall_64
0.00 +3.6 3.63 ± 3% perf-profile.calltrace.cycles-pp.memcpy_erms.vsnprintf.sprintf.info_print_prefix.record_print_text
0.00 +3.9 3.86 ± 3% perf-profile.calltrace.cycles-pp.number.vsnprintf.sprintf.info_print_prefix.record_print_text
0.00 +4.2 4.24 ± 2% perf-profile.calltrace.cycles-pp.memcpy_erms.vsnprintf.sprintf.info_print_prefix.find_first_fitting_seq
0.00 +4.6 4.56 ± 4% perf-profile.calltrace.cycles-pp.number.vsnprintf.sprintf.info_print_prefix.find_first_fitting_seq
0.00 +5.1 5.09 ± 3% perf-profile.calltrace.cycles-pp.format_decode.vsnprintf.sprintf.info_print_prefix.record_print_text
0.00 +6.1 6.08 ± 6% perf-profile.calltrace.cycles-pp.format_decode.vsnprintf.sprintf.info_print_prefix.find_first_fitting_seq
0.47 ± 44% +6.4 6.84 ± 4% perf-profile.calltrace.cycles-pp._prb_read_valid.prb_read_valid_info.find_first_fitting_seq.syslog_print_all.do_syslog
0.49 ± 44% +6.6 7.12 ± 3% perf-profile.calltrace.cycles-pp.prb_read_valid_info.find_first_fitting_seq.syslog_print_all.do_syslog.__x64_sys_syslog
1.41 ± 6% +15.7 17.08 ± 3% perf-profile.calltrace.cycles-pp.vsnprintf.sprintf.info_print_prefix.record_print_text.syslog_print_all
1.50 ± 7% +16.8 18.34 ± 3% perf-profile.calltrace.cycles-pp.sprintf.info_print_prefix.record_print_text.syslog_print_all.do_syslog
1.57 ± 7% +17.7 19.25 ± 3% perf-profile.calltrace.cycles-pp.info_print_prefix.record_print_text.syslog_print_all.do_syslog.__x64_sys_syslog
1.61 ± 6% +18.4 20.00 ± 3% perf-profile.calltrace.cycles-pp.vsnprintf.sprintf.info_print_prefix.find_first_fitting_seq.syslog_print_all
1.70 ± 6% +19.8 21.54 ± 3% perf-profile.calltrace.cycles-pp.sprintf.info_print_prefix.find_first_fitting_seq.syslog_print_all.do_syslog
1.78 ± 6% +20.8 22.59 ± 3% perf-profile.calltrace.cycles-pp.info_print_prefix.find_first_fitting_seq.syslog_print_all.do_syslog.__x64_sys_syslog
2.08 ± 7% +22.4 24.43 ± 3% perf-profile.calltrace.cycles-pp.record_print_text.syslog_print_all.do_syslog.__x64_sys_syslog.do_syscall_64
2.39 ± 7% +27.8 30.14 ± 3% perf-profile.calltrace.cycles-pp.find_first_fitting_seq.syslog_print_all.do_syslog.__x64_sys_syslog.do_syscall_64
58.35 ± 7% -58.3 0.04 ± 71% perf-profile.children.cycles-pp._raw_spin_lock
57.20 ± 7% -57.2 0.00 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.00 +0.1 0.11 ± 7% perf-profile.children.cycles-pp.check_stack_object
0.00 +0.1 0.12 ± 3% perf-profile.children.cycles-pp.put_dec
0.00 +0.1 0.14 ± 5% perf-profile.children.cycles-pp.___might_sleep
0.00 +0.1 0.15 ± 5% perf-profile.children.cycles-pp.__printk_safe_enter
0.00 +0.2 0.18 ± 4% perf-profile.children.cycles-pp.__might_sleep
0.01 ±223% +0.4 0.41 ± 5% perf-profile.children.cycles-pp.__virt_addr_valid
0.01 ±223% +0.4 0.43 ± 4% perf-profile.children.cycles-pp.__might_fault
0.01 ±223% +0.5 0.46 ± 7% perf-profile.children.cycles-pp.memcpy
0.06 ± 9% +0.6 0.62 ± 3% perf-profile.children.cycles-pp.memmove
0.07 ± 14% +0.7 0.72 ± 7% perf-profile.children.cycles-pp.copy_user_generic_unrolled
0.07 ± 19% +0.7 0.79 ± 5% perf-profile.children.cycles-pp.get_data
0.09 ± 11% +0.9 1.01 ± 3% perf-profile.children.cycles-pp.put_dec_trunc8
0.09 ± 16% +1.1 1.19 ± 3% perf-profile.children.cycles-pp.skip_atoi
0.12 ± 16% +1.4 1.48 ± 4% perf-profile.children.cycles-pp.__check_heap_object
0.14 ± 7% +1.4 1.54 ± 4% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
0.90 ± 8% +1.4 2.31 ± 4% perf-profile.children.cycles-pp.desc_read
0.26 ± 6% +2.7 2.97 ± 3% perf-profile.children.cycles-pp._copy_to_user
1.04 ± 8% +2.8 3.81 ± 3% perf-profile.children.cycles-pp.desc_read_finalized_seq
0.24 ± 10% +2.8 3.06 ± 2% perf-profile.children.cycles-pp.__check_object_size
1.14 ± 8% +3.5 4.68 ± 4% perf-profile.children.cycles-pp.prb_read_valid
0.40 ± 7% +4.3 4.67 ± 4% perf-profile.children.cycles-pp.memchr
0.57 ± 8% +6.6 7.17 ± 3% perf-profile.children.cycles-pp.prb_read_valid_info
0.70 ± 9% +7.9 8.56 ± 4% perf-profile.children.cycles-pp.number
0.83 ± 8% +9.3 10.13 ± 2% perf-profile.children.cycles-pp.memcpy_erms
1.69 ± 7% +9.8 11.47 ± 3% perf-profile.children.cycles-pp._prb_read_valid
0.89 ± 4% +10.3 11.20 ± 3% perf-profile.children.cycles-pp.format_decode
2.09 ± 7% +22.4 24.50 ± 3% perf-profile.children.cycles-pp.record_print_text
2.39 ± 7% +27.8 30.18 ± 3% perf-profile.children.cycles-pp.find_first_fitting_seq
3.08 ± 6% +35.2 38.28 ± 3% perf-profile.children.cycles-pp.vsnprintf
3.21 ± 6% +36.9 40.07 ± 3% perf-profile.children.cycles-pp.sprintf
3.37 ± 6% +38.6 41.98 ± 3% perf-profile.children.cycles-pp.info_print_prefix
57.20 ± 7% -57.2 0.00 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.15 ± 10% -1.1 0.04 ± 71% perf-profile.self.cycles-pp._raw_spin_lock
0.00 +0.1 0.08 ± 11% perf-profile.self.cycles-pp.check_stack_object
0.00 +0.1 0.10 ± 10% perf-profile.self.cycles-pp.__might_fault
0.00 +0.1 0.11 ± 5% perf-profile.self.cycles-pp.__printk_safe_enter
0.00 +0.1 0.12 ± 3% perf-profile.self.cycles-pp.put_dec
0.00 +0.1 0.14 ± 5% perf-profile.self.cycles-pp.___might_sleep
0.00 +0.2 0.15 ± 7% perf-profile.self.cycles-pp.prb_read_valid
0.00 +0.2 0.18 ± 4% perf-profile.self.cycles-pp.__might_sleep
0.00 +0.3 0.26 ± 5% perf-profile.self.cycles-pp.prb_read_valid_info
0.00 +0.3 0.31 ± 4% perf-profile.self.cycles-pp._copy_to_user
0.00 +0.4 0.38 ± 3% perf-profile.self.cycles-pp.__virt_addr_valid
0.00 +0.4 0.38 ± 7% perf-profile.self.cycles-pp.find_first_fitting_seq
0.01 ±223% +0.4 0.42 ± 8% perf-profile.self.cycles-pp.memcpy
0.14 ± 14% +0.5 0.61 ± 5% perf-profile.self.cycles-pp.syslog_print_all
0.06 ± 9% +0.6 0.61 ± 3% perf-profile.self.cycles-pp.memmove
0.06 ± 14% +0.6 0.70 ± 6% perf-profile.self.cycles-pp.copy_user_generic_unrolled
0.06 ± 21% +0.7 0.75 ± 5% perf-profile.self.cycles-pp.get_data
0.09 ± 11% +0.9 1.01 ± 3% perf-profile.self.cycles-pp.put_dec_trunc8
0.08 ± 18% +1.0 1.10 ± 3% perf-profile.self.cycles-pp.skip_atoi
0.09 ± 12% +1.0 1.13 ± 4% perf-profile.self.cycles-pp.__check_object_size
0.90 ± 8% +1.3 2.17 ± 4% perf-profile.self.cycles-pp.desc_read
0.16 ± 9% +1.3 1.44 ± 5% perf-profile.self.cycles-pp.record_print_text
0.11 ± 15% +1.3 1.44 ± 4% perf-profile.self.cycles-pp.__check_heap_object
0.13 ± 5% +1.4 1.50 ± 4% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
0.14 ± 11% +1.4 1.56 ± 4% perf-profile.self.cycles-pp.desc_read_finalized_seq
0.14 ± 6% +1.6 1.71 ± 4% perf-profile.self.cycles-pp.sprintf
0.16 ± 13% +1.8 1.98 ± 5% perf-profile.self.cycles-pp.info_print_prefix
0.30 ± 7% +3.4 3.70 ± 4% perf-profile.self.cycles-pp._prb_read_valid
0.40 ± 8% +4.2 4.64 ± 4% perf-profile.self.cycles-pp.memchr
0.59 ± 8% +6.6 7.23 ± 4% perf-profile.self.cycles-pp.number
0.78 ± 3% +8.5 9.26 ± 4% perf-profile.self.cycles-pp.format_decode
0.79 ± 8% +8.9 9.71 ± 2% perf-profile.self.cycles-pp.memcpy_erms
0.84 ± 6% +10.3 11.11 ± 4% perf-profile.self.cycles-pp.vsnprintf
stress-ng.klog.ops
350000 +------------------------------------------------------------------+
|O OO OO OO OO OO OO OO OO OO OO OO OO OO OO OO |
300000 |-+ |
| |
250000 |-+ |
| |
200000 |-+ |
| |
150000 |-+ |
| |
100000 |-+ |
| |
50000 |-+ |
|+.++.++.++.++.++.++.++.++.++.++.++.++.++.++.++.++.++.++.++.++.++.+|
0 +------------------------------------------------------------------+
stress-ng.klog.ops_per_sec
5500 +--------------------------------------------------------------------+
5000 |-OO OO OO OO OO OO OO O OO OO OO OO OO OO OO O O |
| |
4500 |-+ |
4000 |-+ |
3500 |-+ |
3000 |-+ |
| |
2500 |-+ |
2000 |-+ |
1500 |-+ |
1000 |-+ |
| |
500 |.++.++.++.++.++.++.++.+.++.++.++.++.++.++.++.+.++.++.++.++.++.++.++.|
0 +--------------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.11.0-09233-g996e966640dd" of type "text/plain" (172310 bytes)
View attachment "job-script" of type "text/plain" (8103 bytes)
View attachment "job.yaml" of type "text/plain" (5601 bytes)
View attachment "reproduce" of type "text/plain" (532 bytes)
Powered by blists - more mailing lists