[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190718095037.GC27250@shao2-debian>
Date: Thu, 18 Jul 2019 17:50:37 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Nadav Amit <namit@...are.com>
Cc: Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andy Lutomirski <luto@...capital.net>,
Rick Edgecombe <rick.p.edgecombe@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Andy Lutomirski <luto@...nel.org>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...el.com>,
"H. Peter Anvin" <hpa@...or.com>, Jessica Yu <jeyu@...nel.org>,
Kees Cook <keescook@...omium.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Rik van Riel <riel@...riel.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [x86/modules] f2c65fb322: will-it-scale.per_process_ops -2.9%
regression
Greeting,
FYI, we noticed a -2.9% regression of will-it-scale.per_process_ops due to commit:
commit: f2c65fb3221adc6b73b0549fc7ba892022db9797 ("x86/modules: Avoid breaking W^X while loading modules")
https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
with following parameters:
nr_task: 100%
mode: process
test: poll1
cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-2019-05-14.cgz/lkp-csl-2ap1/poll1/will-it-scale
commit:
7298e24f90 ("x86/kprobes: Set instruction page as executable")
f2c65fb322 ("x86/modules: Avoid breaking W^X while loading modules")
7298e24f904224fa f2c65fb3221adc6b73b0549fc7b
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:4 50% 2:4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
%stddev %change %stddev
\ | \
5017865 -2.9% 4874578 will-it-scale.per_process_ops
9.634e+08 -2.9% 9.359e+08 will-it-scale.workload
33.31 -1.0% 32.99 boot-time.dhcp
10541056 ± 7% +12.1% 11820032 meminfo.DirectMap2M
652.58 ± 7% +10.3% 719.62 ± 8% sched_debug.cfs_rq:/.util_avg.min
19388 ± 14% -17.5% 16003 ± 6% softirqs.CPU134.RCU
1140906 -2.7% 1109655 ± 2% proc-vmstat.pgalloc_normal
1069002 ± 2% -4.7% 1018671 ± 3% proc-vmstat.pgfree
2124 ± 11% -21.8% 1662 ± 4% slabinfo.pool_workqueue.active_objs
2220 ± 11% -21.6% 1742 ± 5% slabinfo.pool_workqueue.num_objs
8708 ± 3% +11.6% 9714 ± 6% numa-vmstat.node0.nr_kernel_stack
6771 ± 7% -12.5% 5924 ± 2% numa-vmstat.node3.nr_kernel_stack
5180 ± 19% -31.7% 3540 ± 13% numa-vmstat.node3.nr_slab_reclaimable
8710 ± 3% +11.5% 9714 ± 6% numa-meminfo.node0.KernelStack
20724 ± 19% -31.7% 14162 ± 13% numa-meminfo.node3.KReclaimable
6773 ± 7% -12.5% 5923 ± 2% numa-meminfo.node3.KernelStack
20724 ± 19% -31.7% 14162 ± 13% numa-meminfo.node3.SReclaimable
82391 ± 9% -12.4% 72177 ± 8% numa-meminfo.node3.Slab
171.00 ±103% -87.6% 21.25 ± 70% interrupts.CPU110.RES:Rescheduling_interrupts
8457 -37.6% 5278 ± 34% interrupts.CPU145.NMI:Non-maskable_interrupts
8457 -37.6% 5278 ± 34% interrupts.CPU145.PMI:Performance_monitoring_interrupts
8453 -37.5% 5285 ± 34% interrupts.CPU146.NMI:Non-maskable_interrupts
8453 -37.5% 5285 ± 34% interrupts.CPU146.PMI:Performance_monitoring_interrupts
8458 -37.5% 5284 ± 34% interrupts.CPU147.NMI:Non-maskable_interrupts
8458 -37.5% 5284 ± 34% interrupts.CPU147.PMI:Performance_monitoring_interrupts
8456 -37.6% 5279 ± 34% interrupts.CPU149.NMI:Non-maskable_interrupts
8456 -37.6% 5279 ± 34% interrupts.CPU149.PMI:Performance_monitoring_interrupts
8449 -37.5% 5278 ± 34% interrupts.CPU151.NMI:Non-maskable_interrupts
8449 -37.5% 5278 ± 34% interrupts.CPU151.PMI:Performance_monitoring_interrupts
14.75 ± 41% +2237.3% 344.75 ± 93% interrupts.CPU154.RES:Rescheduling_interrupts
108.75 ±112% -85.7% 15.50 ± 41% interrupts.CPU168.RES:Rescheduling_interrupts
168.00 ± 19% -41.8% 97.75 ± 40% interrupts.CPU18.RES:Rescheduling_interrupts
1882 ± 87% -94.9% 95.50 ± 50% interrupts.CPU19.RES:Rescheduling_interrupts
218.75 ± 88% +196.9% 649.50 ± 69% interrupts.CPU22.RES:Rescheduling_interrupts
8448 -37.4% 5291 ± 34% interrupts.CPU31.NMI:Non-maskable_interrupts
8448 -37.4% 5291 ± 34% interrupts.CPU31.PMI:Performance_monitoring_interrupts
79.50 ± 23% +537.4% 506.75 ± 75% interrupts.CPU33.RES:Rescheduling_interrupts
8436 -37.5% 5274 ± 34% interrupts.CPU34.NMI:Non-maskable_interrupts
8436 -37.5% 5274 ± 34% interrupts.CPU34.PMI:Performance_monitoring_interrupts
195.50 ±127% +423.1% 1022 ± 47% interrupts.CPU36.RES:Rescheduling_interrupts
43.00 ± 46% +3592.4% 1587 ±150% interrupts.CPU37.RES:Rescheduling_interrupts
82.00 ± 28% +103.4% 166.75 ± 52% interrupts.CPU41.RES:Rescheduling_interrupts
102.75 ± 29% -53.3% 48.00 ± 45% interrupts.CPU50.RES:Rescheduling_interrupts
474.00 ± 30% -73.0% 127.75 ± 75% interrupts.CPU68.RES:Rescheduling_interrupts
8042 ± 4% +6.5% 8565 ± 4% interrupts.CPU95.RES:Rescheduling_interrupts
0.02 -6.0% 0.02 ± 5% perf-stat.i.MPKI
1.111e+11 -2.8% 1.079e+11 perf-stat.i.branch-instructions
2.454e+08 -3.4% 2.371e+08 perf-stat.i.branch-misses
726252 ± 3% -7.8% 669916 ± 3% perf-stat.i.cache-misses
13163065 -8.5% 12044270 ± 5% perf-stat.i.cache-references
1.02 +3.0% 1.05 perf-stat.i.cpi
835228 +7.0% 893984 ± 4% perf-stat.i.cycles-between-cache-misses
1.419e+11 -2.8% 1.38e+11 perf-stat.i.dTLB-loads
0.00 -0.0 0.00 ± 3% perf-stat.i.dTLB-store-miss-rate%
13827 -8.8% 12606 ± 3% perf-stat.i.dTLB-store-misses
1.023e+11 -2.8% 9.937e+10 perf-stat.i.dTLB-stores
2.909e+08 -12.0% 2.56e+08 ± 2% perf-stat.i.iTLB-load-misses
5.448e+11 -2.8% 5.293e+11 perf-stat.i.instructions
1876 +10.4% 2071 ± 2% perf-stat.i.instructions-per-iTLB-miss
0.98 -2.9% 0.95 perf-stat.i.ipc
3297 -1.7% 3241 perf-stat.i.minor-faults
31602 ± 11% -21.5% 24805 ± 15% perf-stat.i.node-loads
3297 -1.7% 3241 perf-stat.i.page-faults
0.02 -5.8% 0.02 ± 5% perf-stat.overall.MPKI
1.02 +3.0% 1.05 perf-stat.overall.cpi
766162 ± 3% +8.5% 831132 ± 3% perf-stat.overall.cycles-between-cache-misses
0.00 -0.0 0.00 ± 3% perf-stat.overall.dTLB-store-miss-rate%
1872 +10.4% 2068 ± 2% perf-stat.overall.instructions-per-iTLB-miss
0.98 -2.9% 0.95 perf-stat.overall.ipc
85.08 +3.2 88.25 ± 2% perf-stat.overall.node-load-miss-rate%
1.107e+11 -2.8% 1.075e+11 perf-stat.ps.branch-instructions
2.446e+08 -3.4% 2.363e+08 perf-stat.ps.branch-misses
723856 ± 3% -7.8% 667682 ± 3% perf-stat.ps.cache-misses
13118781 -8.5% 12003520 ± 5% perf-stat.ps.cache-references
1.414e+11 -2.8% 1.375e+11 perf-stat.ps.dTLB-loads
13781 -8.8% 12564 ± 3% perf-stat.ps.dTLB-store-misses
1.019e+11 -2.8% 9.903e+10 perf-stat.ps.dTLB-stores
2.899e+08 -12.0% 2.551e+08 ± 2% perf-stat.ps.iTLB-load-misses
5.429e+11 -2.8% 5.275e+11 perf-stat.ps.instructions
3286 -1.7% 3230 perf-stat.ps.minor-faults
31502 ± 11% -21.5% 24724 ± 15% perf-stat.ps.node-loads
3286 -1.7% 3230 perf-stat.ps.page-faults
1.623e+14 -3.3% 1.57e+14 perf-stat.total.instructions
25.77 -0.7 25.07 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.poll
3.34 -0.3 3.08 perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
18.82 -0.2 18.60 perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
98.24 -0.2 98.08 perf-profile.calltrace.cycles-pp.poll
4.33 -0.2 4.17 perf-profile.calltrace.cycles-pp.copy_user_generic_unrolled._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
4.96 -0.1 4.86 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.poll
3.32 -0.1 3.21 perf-profile.calltrace.cycles-pp.___might_sleep.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll
1.63 +0.2 1.79 perf-profile.calltrace.cycles-pp.testcase
6.54 +0.2 6.72 perf-profile.calltrace.cycles-pp.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
1.85 +0.2 2.05 perf-profile.calltrace.cycles-pp.__might_sleep.__might_fault._copy_from_user.do_sys_poll.__x64_sys_poll
1.41 +0.3 1.74 ± 2% perf-profile.calltrace.cycles-pp.check_stack_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
3.44 +0.5 3.91 perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.12 +0.6 1.68 perf-profile.calltrace.cycles-pp.poll_freewait.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
64.34 +0.7 65.03 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.poll
60.51 +0.7 61.25 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
49.87 +0.8 50.68 perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
56.87 +1.0 57.84 perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
17.21 -0.4 16.78 perf-profile.children.cycles-pp.entry_SYSCALL_64
13.72 -0.4 13.32 perf-profile.children.cycles-pp.syscall_return_via_sysret
3.67 -0.3 3.38 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
5.37 -0.3 5.10 perf-profile.children.cycles-pp.copy_user_generic_unrolled
18.32 -0.1 18.20 perf-profile.children.cycles-pp._copy_from_user
3.53 -0.1 3.42 perf-profile.children.cycles-pp.___might_sleep
98.40 -0.1 98.31 perf-profile.children.cycles-pp.poll
1.82 +0.1 1.96 perf-profile.children.cycles-pp.testcase
6.96 +0.2 7.15 perf-profile.children.cycles-pp.__might_fault
1.96 +0.2 2.17 perf-profile.children.cycles-pp.__might_sleep
1.55 +0.3 1.82 perf-profile.children.cycles-pp.check_stack_object
3.79 +0.6 4.40 perf-profile.children.cycles-pp.__check_object_size
1.13 +0.6 1.76 perf-profile.children.cycles-pp.poll_freewait
64.46 +0.7 65.13 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
60.89 +0.7 61.62 perf-profile.children.cycles-pp.do_syscall_64
56.65 +0.9 57.52 perf-profile.children.cycles-pp.__x64_sys_poll
51.27 +1.1 52.38 perf-profile.children.cycles-pp.do_sys_poll
13.70 -0.4 13.30 perf-profile.self.cycles-pp.syscall_return_via_sysret
4.78 -0.3 4.45 perf-profile.self.cycles-pp.__x64_sys_poll
3.40 -0.3 3.15 perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
5.01 -0.2 4.78 perf-profile.self.cycles-pp.copy_user_generic_unrolled
11.45 -0.2 11.25 perf-profile.self.cycles-pp.poll
8.55 -0.2 8.39 perf-profile.self.cycles-pp.entry_SYSCALL_64
3.96 -0.1 3.85 perf-profile.self.cycles-pp.do_syscall_64
3.40 -0.1 3.28 perf-profile.self.cycles-pp.___might_sleep
3.53 -0.1 3.47 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.73 -0.0 0.71 perf-profile.self.cycles-pp.poll_select_set_timeout
2.78 +0.1 2.84 perf-profile.self.cycles-pp._copy_from_user
1.58 +0.1 1.66 perf-profile.self.cycles-pp.__might_fault
1.56 +0.1 1.65 perf-profile.self.cycles-pp.testcase
1.41 +0.2 1.57 perf-profile.self.cycles-pp.check_stack_object
1.73 +0.2 1.94 perf-profile.self.cycles-pp.__might_sleep
2.30 +0.3 2.61 perf-profile.self.cycles-pp.__check_object_size
23.36 +0.3 23.70 perf-profile.self.cycles-pp.do_sys_poll
0.92 +0.6 1.53 perf-profile.self.cycles-pp.poll_freewait
will-it-scale.per_process_ops
6e+06 +-+-----------------------------------------------------------------+
| |
5e+06 +-++...+ +..+...+..+...+..+..+...+..+...+..+..+...+..+...+..|
O O O O O O O O O O O O O O O O O O O
| : : |
4e+06 +-+ : : |
| : : |
3e+06 +-+ : : |
| : : |
2e+06 +-+ : : |
| : : |
| : : |
1e+06 +-+ : : |
| : : |
0 +-+-------O---O-----------------------------------------------------+
will-it-scale.workload
1e+09 +-+-----------------------------------------------------------------+
9e+08 O-+O...O O..O. O O O..O..O...O..O...O..O..O...O. O O..O
| : : |
8e+08 +-+ : : |
7e+08 +-+ : : |
| : : |
6e+08 +-+ : : |
5e+08 +-+ : : |
4e+08 +-+ : : |
| : : |
3e+08 +-+ : : |
2e+08 +-+ : : |
| : : |
1e+08 +-+ : : |
0 +-+-------O---O-----------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.1.0-rc7-00027-gf2c65fb3221adc" of type "text/plain" (189140 bytes)
View attachment "job-script" of type "text/plain" (7329 bytes)
View attachment "job.yaml" of type "text/plain" (4893 bytes)
View attachment "reproduce" of type "text/plain" (310 bytes)
Powered by blists - more mailing lists