Message-ID: <20181127062145.GG6163@shao2-debian>
Date: Tue, 27 Nov 2018 14:21:45 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: NeilBrown <neilb@...e.com>
Cc: Jeff Layton <jlayton@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Jeff Layton <jlayton@...hat.com>, lkp@...org
Subject: [LKP] [fs/locks] 816f2fb5a2: will-it-scale.per_process_ops -93.1%
regression
Greetings,
FYI, we noticed a -93.1% regression of will-it-scale.per_process_ops due to commit:
commit: 816f2fb5a2fc678c2595ebf1bc384c019edffc97 ("fs/locks: allow a lock request to block other requests.")
https://git.kernel.org/cgit/linux/kernel/git/jlayton/linux.git locks-next
in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
with following parameters:
nr_task: 100%
mode: process
test: lock1
ucode: 0xb00002e
cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both process- and thread-based variants of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
In addition, the commit has a significant impact on the following tests:
+------------------+----------------------------------------------------------------------+
| testcase: change | will-it-scale: |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | cpufreq_governor=performance |
| | mode=thread |
| | nr_task=50% |
| | test=lock1 |
+------------------+----------------------------------------------------------------------+
| testcase: change | will-it-scale: |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | cpufreq_governor=performance |
| | mode=thread |
| | nr_task=100% |
| | test=lock1 |
| | ucode=0xb00002e |
+------------------+----------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.flock.ops -93.2% regression                     |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | class=filesystem |
| | cpufreq_governor=performance |
| | disk=1HDD |
| | nr_threads=100% |
| | testtime=1s |
| | ucode=0xb00002e |
+------------------+----------------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.2/process/100%/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3b/lock1/will-it-scale/0xb00002e
commit:
48a7a13ff3 ("locks: use properly initialized file_lock when unlocking.")
816f2fb5a2 ("fs/locks: allow a lock request to block other requests.")
48a7a13ff31f0728 816f2fb5a2fc678c2595ebf1bc
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
:4 25% 1:4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
%stddev %change %stddev
\ | \
1040157 -93.1% 71598 will-it-scale.per_process_ops
7964 -1.5% 7847 will-it-scale.time.maximum_resident_set_size
91533877 -93.1% 6300666 will-it-scale.workload
1428052 ± 3% -50.6% 705034 ± 13% softirqs.RCU
1205 -9.4% 1091 vmstat.system.cs
292808 ± 12% -14.0% 251848 ± 8% meminfo.DirectMap4k
74179 ± 3% +13.2% 83951 ± 3% meminfo.Shmem
84.74 +13.4 98.19 mpstat.cpu.sys%
14.61 -13.6 1.01 mpstat.cpu.usr%
141844 ± 16% +29.6% 183808 ± 16% numa-meminfo.node1.Active
141841 ± 16% +29.6% 183806 ± 16% numa-meminfo.node1.Active(anon)
35481 ± 16% +29.5% 45936 ± 16% numa-vmstat.node1.nr_active_anon
35481 ± 16% +29.5% 45936 ± 16% numa-vmstat.node1.nr_zone_active_anon
2751 +0.9% 2777 turbostat.Avg_MHz
0.54 ± 2% -59.3% 0.22 ± 64% turbostat.CPU%c1
280.24 -20.4% 223.16 turbostat.PkgWatt
6.43 ± 2% +17.1% 7.52 turbostat.RAMWatt
78497 +2.8% 80729 proc-vmstat.nr_active_anon
4536 +2.8% 4662 proc-vmstat.nr_inactive_anon
7002 +2.3% 7161 proc-vmstat.nr_mapped
18546 ± 3% +13.2% 20990 ± 3% proc-vmstat.nr_shmem
78497 +2.8% 80729 proc-vmstat.nr_zone_active_anon
4536 +2.8% 4662 proc-vmstat.nr_zone_inactive_anon
688482 +1.0% 695551 proc-vmstat.numa_hit
671323 +1.1% 678385 proc-vmstat.numa_local
20467 ± 4% +15.7% 23691 ± 4% proc-vmstat.pgactivate
14.92 ± 6% +18.1% 17.62 ± 3% sched_debug.cfs_rq:/.load_avg.avg
23.49 ± 14% +36.2% 31.99 ± 4% sched_debug.cfs_rq:/.load_avg.stddev
0.05 ± 6% +13.8% 0.05 ± 7% sched_debug.cfs_rq:/.nr_running.stddev
0.49 ±172% +490.9% 2.87 ± 32% sched_debug.cfs_rq:/.removed.load_avg.avg
4.53 ±172% +374.8% 21.49 ± 15% sched_debug.cfs_rq:/.removed.load_avg.stddev
22.30 ±172% +491.0% 131.81 ± 32% sched_debug.cfs_rq:/.removed.runnable_sum.avg
208.03 ±172% +374.7% 987.52 ± 16% sched_debug.cfs_rq:/.removed.runnable_sum.stddev
0.25 ±172% +407.5% 1.25 ± 40% sched_debug.cfs_rq:/.removed.util_avg.avg
2.30 ±172% +307.3% 9.35 ± 27% sched_debug.cfs_rq:/.removed.util_avg.stddev
45.75 ± 13% +18.9% 54.38 ± 9% sched_debug.cpu.cpu_load[4].max
4.20 ± 15% +19.8% 5.04 ± 10% sched_debug.cpu.cpu_load[4].stddev
13611 ± 25% +358.4% 62395 ± 20% sched_debug.cpu.nr_switches.max
801.46 ± 5% -59.6% 323.42 ± 4% sched_debug.cpu.nr_switches.min
2434 ± 8% +183.6% 6905 ± 15% sched_debug.cpu.nr_switches.stddev
521.46 ± 7% -80.6% 101.08 ± 6% sched_debug.cpu.sched_count.min
5143 ± 10% +69.5% 8720 ± 17% sched_debug.cpu.sched_count.stddev
3869 ± 9% +648.3% 28955 ± 19% sched_debug.cpu.ttwu_count.max
273.25 ± 6% -70.7% 80.00 ± 2% sched_debug.cpu.ttwu_count.min
821.23 ± 8% +286.2% 3171 ± 15% sched_debug.cpu.ttwu_count.stddev
2130 ± 10% +1256.0% 28887 ± 19% sched_debug.cpu.ttwu_local.max
239.08 ± 4% -81.6% 44.04 ± 3% sched_debug.cpu.ttwu_local.min
378.58 ± 4% +734.4% 3158 ± 15% sched_debug.cpu.ttwu_local.stddev
1.14e+13 -59.9% 4.568e+12 perf-stat.branch-instructions
0.98 -0.7 0.25 perf-stat.branch-miss-rate%
1.122e+11 -90.0% 1.123e+10 perf-stat.branch-misses
1.85 ± 6% +37.4 39.28 perf-stat.cache-miss-rate%
50742484 ± 7% +14238.2% 7.276e+09 perf-stat.cache-misses
2.747e+09 ± 6% +574.5% 1.853e+10 perf-stat.cache-references
355461 -8.9% 323737 perf-stat.context-switches
1.29 +199.8% 3.86 perf-stat.cpi
7.288e+13 +1.1% 7.367e+13 perf-stat.cpu-cycles
16159 -35.8% 10371 perf-stat.cpu-migrations
0.00 ± 3% +0.0 0.00 ± 3% perf-stat.dTLB-load-miss-rate%
64071048 ± 3% +39.2% 89198727 ± 3% perf-stat.dTLB-load-misses
1.664e+13 -70.3% 4.939e+12 perf-stat.dTLB-loads
0.00 ± 38% +0.0 0.01 ± 2% perf-stat.dTLB-store-miss-rate%
1.177e+13 -92.8% 8.524e+11 perf-stat.dTLB-stores
98.35 -20.8 77.54 perf-stat.iTLB-load-miss-rate%
5.521e+10 -92.9% 3.921e+09 perf-stat.iTLB-load-misses
5.658e+13 -66.3% 1.908e+13 perf-stat.instructions
1024 +374.7% 4864 perf-stat.instructions-per-iTLB-miss
0.78 -66.6% 0.26 perf-stat.ipc
88.10 ± 2% +11.8 99.87 perf-stat.node-load-miss-rate%
16078162 +10342.5% 1.679e+09 perf-stat.node-load-misses
42.71 ± 27% +36.1 78.79 perf-stat.node-store-miss-rate%
4954835 ± 31% +36937.8% 1.835e+09 perf-stat.node-store-misses
6456394 ± 17% +7551.4% 4.94e+08 perf-stat.node-stores
618105 +389.8% 3027476 perf-stat.path-length
15.76 -15.0 0.72 perf-profile.calltrace.cycles-pp.locks_alloc_lock.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl
13.32 -12.7 0.62 perf-profile.calltrace.cycles-pp.kmem_cache_alloc.locks_alloc_lock.posix_lock_inode.do_lock_file_wait.fcntl_setlk
10.67 -10.3 0.40 ± 57% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
10.40 -10.1 0.26 ±100% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
7.64 -7.6 0.00 perf-profile.calltrace.cycles-pp.locks_alloc_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
6.44 -6.4 0.00 perf-profile.calltrace.cycles-pp.security_file_lock.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl
6.42 -6.4 0.00 perf-profile.calltrace.cycles-pp.kmem_cache_alloc.locks_alloc_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl
5.94 -5.9 0.00 perf-profile.calltrace.cycles-pp.security_file_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.17 -5.2 0.00 perf-profile.calltrace.cycles-pp._copy_from_user.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.04 -5.0 0.00 perf-profile.calltrace.cycles-pp.memset_erms.kmem_cache_alloc.locks_alloc_lock.posix_lock_inode.do_lock_file_wait
1.13 -0.6 0.52 ± 3% perf-profile.calltrace.cycles-pp.locks_insert_lock_ctx.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl
76.72 +22.1 98.80 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
75.87 +22.9 98.76 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
73.37 +25.3 98.62 perf-profile.calltrace.cycles-pp.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
62.79 +35.3 98.10 perf-profile.calltrace.cycles-pp.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
56.50 +41.3 97.82 perf-profile.calltrace.cycles-pp.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
41.87 +55.3 97.13 perf-profile.calltrace.cycles-pp.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
33.25 +63.5 96.74 perf-profile.calltrace.cycles-pp.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl
0.00 +94.4 94.37 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.locks_move_blocks.posix_lock_inode.do_lock_file_wait
0.00 +94.8 94.77 perf-profile.calltrace.cycles-pp._raw_spin_lock.locks_move_blocks.posix_lock_inode.do_lock_file_wait.fcntl_setlk
0.00 +94.8 94.79 perf-profile.calltrace.cycles-pp.locks_move_blocks.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl
24.02 -22.9 1.10 ± 2% perf-profile.children.cycles-pp.locks_alloc_lock
20.64 -19.7 0.96 perf-profile.children.cycles-pp.kmem_cache_alloc
12.01 -11.4 0.59 ± 2% perf-profile.children.cycles-pp.syscall_return_via_sysret
10.68 -10.2 0.52 ± 3% perf-profile.children.cycles-pp.entry_SYSCALL_64
9.14 -8.7 0.41 ± 2% perf-profile.children.cycles-pp.file_has_perm
7.08 -6.8 0.33 perf-profile.children.cycles-pp.memset_erms
6.55 -6.3 0.29 ± 2% perf-profile.children.cycles-pp.security_file_lock
5.98 -5.7 0.29 ± 2% perf-profile.children.cycles-pp.security_file_fcntl
5.57 -5.3 0.26 ± 3% perf-profile.children.cycles-pp.kmem_cache_free
5.30 -5.1 0.23 ± 2% perf-profile.children.cycles-pp._copy_from_user
4.51 -4.3 0.20 ± 2% perf-profile.children.cycles-pp.avc_has_perm
3.88 -3.7 0.19 ± 3% perf-profile.children.cycles-pp.___might_sleep
2.69 -2.6 0.11 perf-profile.children.cycles-pp.__might_sleep
2.65 -2.5 0.11 ± 3% perf-profile.children.cycles-pp.locks_free_lock
2.24 -2.1 0.11 ± 4% perf-profile.children.cycles-pp.locks_dispose_list
2.16 -2.1 0.11 ± 4% perf-profile.children.cycles-pp.__fget_light
1.97 -1.9 0.10 ± 4% perf-profile.children.cycles-pp.copy_user_generic_unrolled
1.88 -1.8 0.08 ± 8% perf-profile.children.cycles-pp.__might_fault
1.80 -1.7 0.07 ± 5% perf-profile.children.cycles-pp._cond_resched
1.73 -1.7 0.08 ± 5% perf-profile.children.cycles-pp.locks_delete_lock_ctx
1.56 -1.5 0.08 ± 5% perf-profile.children.cycles-pp.selinux_file_lock
1.38 -1.3 0.07 ± 6% perf-profile.children.cycles-pp.inode_has_perm
1.16 -1.1 0.05 ± 8% perf-profile.children.cycles-pp.locks_unlink_lock_ctx
1.16 -1.1 0.05 ± 8% perf-profile.children.cycles-pp.locks_release_private
1.21 -0.7 0.52 ± 3% perf-profile.children.cycles-pp.locks_insert_lock_ctx
76.99 +21.9 98.84 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
75.95 +22.8 98.79 perf-profile.children.cycles-pp.do_syscall_64
73.57 +25.1 98.63 perf-profile.children.cycles-pp.__x64_sys_fcntl
63.09 +35.0 98.12 perf-profile.children.cycles-pp.do_fcntl
56.82 +41.0 97.83 perf-profile.children.cycles-pp.fcntl_setlk
42.14 +55.0 97.14 perf-profile.children.cycles-pp.do_lock_file_wait
33.91 +62.9 96.78 perf-profile.children.cycles-pp.posix_lock_inode
3.67 +91.7 95.41 perf-profile.children.cycles-pp._raw_spin_lock
0.00 +94.4 94.40 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.00 +94.8 94.79 perf-profile.children.cycles-pp.locks_move_blocks
11.98 -11.4 0.59 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret
10.68 -10.2 0.52 ± 3% perf-profile.self.cycles-pp.entry_SYSCALL_64
7.94 -7.6 0.38 ± 2% perf-profile.self.cycles-pp.kmem_cache_alloc
6.88 -6.6 0.32 perf-profile.self.cycles-pp.memset_erms
5.46 -5.2 0.25 ± 2% perf-profile.self.cycles-pp.kmem_cache_free
4.45 -4.2 0.20 perf-profile.self.cycles-pp.avc_has_perm
4.29 -4.1 0.20 ± 3% perf-profile.self.cycles-pp.posix_lock_inode
3.77 -3.6 0.18 ± 2% perf-profile.self.cycles-pp.___might_sleep
3.16 -3.0 0.14 perf-profile.self.cycles-pp.fcntl_setlk
3.06 -2.9 0.15 ± 3% perf-profile.self.cycles-pp.file_has_perm
2.80 -2.7 0.12 ± 6% perf-profile.self.cycles-pp.locks_alloc_lock
3.57 -2.6 1.00 ± 3% perf-profile.self.cycles-pp._raw_spin_lock
2.45 -2.4 0.10 perf-profile.self.cycles-pp.__might_sleep
2.21 -2.1 0.11 ± 3% perf-profile.self.cycles-pp.__x64_sys_fcntl
2.11 -2.0 0.11 ± 4% perf-profile.self.cycles-pp.__fget_light
2.08 -2.0 0.11 ± 3% perf-profile.self.cycles-pp.do_syscall_64
1.85 -1.8 0.09 perf-profile.self.cycles-pp.copy_user_generic_unrolled
1.55 -1.5 0.06 ± 6% perf-profile.self.cycles-pp.locks_free_lock
1.42 -1.3 0.07 perf-profile.self.cycles-pp.selinux_file_lock
1.29 -1.2 0.06 perf-profile.self.cycles-pp.do_lock_file_wait
1.27 -1.2 0.06 perf-profile.self.cycles-pp.inode_has_perm
1.23 -1.2 0.06 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
1.17 ± 3% -1.1 0.05 ± 9% perf-profile.self.cycles-pp.do_fcntl
0.00 +94.1 94.06 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
will-it-scale.per_process_ops
1.2e+06 +-+---------------------------------------------------------------+
| |
1e+06 +-+..+.+.+..+.+..+.+.+..+.+.+..+.+.+..+.+.+..+.+..+.+.+..+.+.+..+.|
| |
| |
800000 +-+ |
| |
600000 +-+ |
| |
400000 +-+ |
| |
| |
200000 +-+ |
| O O O O |
0 O-O--O-O-O--O-O--O-O-O--O-O-O--O-O-O--O-O-O-----------------------+
will-it-scale.workload
1e+08 +-+-----------------------------------------------------------------+
9e+07 +-+..+.+..+.+..+.+.+..+.+..+.+..+.+.+..+.+..+.+..+.+.+..+.+..+.+..+.|
| |
8e+07 +-+ |
7e+07 +-+ |
| |
6e+07 +-+ |
5e+07 +-+ |
4e+07 +-+ |
| |
3e+07 +-+ |
2e+07 +-+ |
| |
1e+07 +-+ O O O O |
0 O-O--O-O--O-O--O-O-O--O-O--O-O--O-O-O--O-O--O-----------------------+
[*] bisect-good sample
[O] bisect-bad sample
***************************************************************************************************
lkp-bdw-ep3d: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.2/thread/50%/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3d/lock1/will-it-scale
commit:
48a7a13ff3 ("locks: use properly initialized file_lock when unlocking.")
816f2fb5a2 ("fs/locks: allow a lock request to block other requests.")
48a7a13ff31f0728 816f2fb5a2fc678c2595ebf1bc
---------------- --------------------------
%stddev %change %stddev
\ | \
4156 -19.6% 3342 ± 3% cpuidle.C1.usage
283.50 ± 13% +107.5% 588.25 ± 31% turbostat.C1E
67074 +1.3% 67934 proc-vmstat.nr_active_anon
67074 +1.3% 67934 proc-vmstat.nr_zone_active_anon
2516 ± 6% +15.8% 2913 ± 6% slabinfo.kmalloc-512.active_objs
2582 ± 4% +15.9% 2992 ± 7% slabinfo.kmalloc-512.num_objs
18858 ± 19% -55.2% 8449 ± 65% numa-meminfo.node0.Inactive
18732 ± 19% -55.1% 8414 ± 65% numa-meminfo.node0.Inactive(anon)
2359 ± 4% -14.1% 2026 ± 10% numa-meminfo.node0.PageTables
34185 ± 3% -9.1% 31057 ± 4% numa-meminfo.node0.SUnreclaim
5128 ± 71% +203.9% 15585 ± 35% numa-meminfo.node1.Inactive
5118 ± 71% +202.6% 15485 ± 35% numa-meminfo.node1.Inactive(anon)
26927 ± 2% +13.9% 30675 ± 5% numa-meminfo.node1.SUnreclaim
4681 ± 19% -55.4% 2088 ± 65% numa-vmstat.node0.nr_inactive_anon
589.50 ± 4% -14.1% 506.25 ± 10% numa-vmstat.node0.nr_page_table_pages
8546 ± 3% -9.2% 7764 ± 4% numa-vmstat.node0.nr_slab_unreclaimable
4681 ± 19% -55.4% 2088 ± 65% numa-vmstat.node0.nr_zone_inactive_anon
1279 ± 71% +202.6% 3870 ± 35% numa-vmstat.node1.nr_inactive_anon
6731 ± 2% +13.9% 7668 ± 5% numa-vmstat.node1.nr_slab_unreclaimable
1279 ± 71% +202.6% 3870 ± 35% numa-vmstat.node1.nr_zone_inactive_anon
0.48 +0.0 0.52 ± 2% perf-stat.branch-miss-rate%
8.166e+09 ± 3% +9.1% 8.911e+09 ± 2% perf-stat.branch-misses
47.72 ± 2% -3.8 43.90 perf-stat.cache-miss-rate%
28734 +9.8% 31537 ± 4% perf-stat.cpu-migrations
0.00 +0.0 0.01 ± 20% perf-stat.dTLB-store-miss-rate%
32341984 +48.8% 48128200 ± 22% perf-stat.dTLB-store-misses
95.81 -7.1 88.71 ± 3% perf-stat.iTLB-load-miss-rate%
3.634e+09 ± 4% +15.2% 4.186e+09 perf-stat.iTLB-load-misses
1.612e+08 ± 31% +233.6% 5.378e+08 ± 28% perf-stat.iTLB-loads
2070 ± 2% -11.9% 1823 perf-stat.instructions-per-iTLB-miss
79.18 +2.9 82.12 perf-stat.node-store-miss-rate%
1.899e+09 +18.3% 2.247e+09 perf-stat.node-store-misses
4.993e+08 -2.0% 4.891e+08 perf-stat.node-stores
1236837 -2.9% 1200576 perf-stat.path-length
51831 +18.9% 61619 ± 7% sched_debug.cfs_rq:/.load.max
20947 ± 7% -9.2% 19025 sched_debug.cfs_rq:/.load.min
1.50 +14.1% 1.71 ± 5% sched_debug.cfs_rq:/.nr_spread_over.avg
20.00 ± 8% -9.6% 18.08 sched_debug.cfs_rq:/.runnable_load_avg.min
20939 ± 7% -9.4% 18977 sched_debug.cfs_rq:/.runnable_weight.min
-4681 +203.5% -14206 sched_debug.cfs_rq:/.spread0.min
315390 ± 4% -55.7% 139750 ± 59% sched_debug.cpu.avg_idle.min
155336 +38.2% 214672 ± 12% sched_debug.cpu.avg_idle.stddev
55.42 ± 2% +18.4% 65.62 ± 11% sched_debug.cpu.cpu_load[0].max
20.00 ± 8% -9.6% 18.08 sched_debug.cpu.cpu_load[0].min
52.67 ± 3% +18.2% 62.25 ± 10% sched_debug.cpu.cpu_load[1].max
19.67 ± 5% -7.4% 18.21 sched_debug.cpu.cpu_load[1].min
51831 +101.3% 104339 ± 68% sched_debug.cpu.load.max
20943 ± 7% -9.2% 19025 sched_debug.cpu.load.min
1512 ± 5% +35.1% 2042 ± 12% sched_debug.cpu.nr_load_updates.stddev
69159 ± 6% -20.7% 54830 ± 15% sched_debug.cpu.sched_count.max
13713 +14.9% 15759 ± 7% sched_debug.cpu.ttwu_count.max
81.19 -32.3 48.92 ± 16% perf-profile.calltrace.cycles-pp._raw_spin_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
79.63 -32.1 47.58 ± 16% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl
1.75 ± 2% -0.6 1.17 ± 14% perf-profile.calltrace.cycles-pp.fput.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
95.53 -0.4 95.16 perf-profile.calltrace.cycles-pp.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
95.93 -0.3 95.60 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
96.03 -0.3 95.73 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
1.81 +0.1 1.92 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
2.55 +0.1 2.69 ± 3% perf-profile.calltrace.cycles-pp.locks_alloc_lock.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl
1.67 +0.2 1.82 ± 2% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
0.00 +0.8 0.83 ± 22% perf-profile.calltrace.cycles-pp._raw_spin_lock.locks_insert_lock_ctx.posix_lock_inode.do_lock_file_wait.fcntl_setlk
0.00 +0.9 0.90 ± 20% perf-profile.calltrace.cycles-pp.locks_insert_lock_ctx.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl
0.00 +30.4 30.37 ± 25% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.locks_move_blocks.posix_lock_inode.do_lock_file_wait
0.00 +31.7 31.71 ± 24% perf-profile.calltrace.cycles-pp._raw_spin_lock.locks_move_blocks.posix_lock_inode.do_lock_file_wait.fcntl_setlk
0.00 +31.8 31.75 ± 24% perf-profile.calltrace.cycles-pp.locks_move_blocks.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl
5.41 +32.5 37.93 ± 21% perf-profile.calltrace.cycles-pp.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl
6.80 +32.6 39.36 ± 20% perf-profile.calltrace.cycles-pp.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
79.66 -1.7 77.98 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
1.75 -0.6 1.19 ± 14% perf-profile.children.cycles-pp.fput
95.57 -0.4 95.20 perf-profile.children.cycles-pp.__x64_sys_fcntl
95.98 -0.3 95.66 perf-profile.children.cycles-pp.do_syscall_64
96.13 -0.3 95.82 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
0.43 ± 3% -0.0 0.40 ± 2% perf-profile.children.cycles-pp.locks_free_lock
0.93 -0.0 0.89 perf-profile.children.cycles-pp.kmem_cache_free
0.29 -0.0 0.26 ± 5% perf-profile.children.cycles-pp._cond_resched
0.07 -0.0 0.04 ± 58% perf-profile.children.cycles-pp.bpf_fd_pass
0.20 ± 4% -0.0 0.18 ± 5% perf-profile.children.cycles-pp.locks_release_private
0.38 -0.0 0.36 perf-profile.children.cycles-pp.locks_dispose_list
0.15 ± 3% -0.0 0.13 ± 5% perf-profile.children.cycles-pp.rcu_all_qs
0.09 -0.0 0.08 ± 5% perf-profile.children.cycles-pp.vfs_lock_file
0.70 +0.0 0.75 ± 3% perf-profile.children.cycles-pp.avc_has_perm
0.00 +0.1 0.05 perf-profile.children.cycles-pp.memset
0.00 +0.1 0.06 ± 20% perf-profile.children.cycles-pp.ret_from_fork
0.00 +0.1 0.06 ± 20% perf-profile.children.cycles-pp.kthread
1.81 +0.1 1.92 perf-profile.children.cycles-pp.entry_SYSCALL_64
1.94 +0.2 2.11 ± 2% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.23 ± 2% +0.7 0.91 ± 20% perf-profile.children.cycles-pp.locks_insert_lock_ctx
0.00 +31.8 31.76 ± 24% perf-profile.children.cycles-pp.locks_move_blocks
5.52 +32.5 38.05 ± 21% perf-profile.children.cycles-pp.posix_lock_inode
6.86 +32.5 39.40 ± 20% perf-profile.children.cycles-pp.do_lock_file_wait
79.38 -1.7 77.72 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.75 ± 2% -0.6 1.17 ± 14% perf-profile.self.cycles-pp.fput
0.10 -0.0 0.06 ± 17% perf-profile.self.cycles-pp.locks_insert_lock_ctx
0.91 -0.0 0.87 perf-profile.self.cycles-pp.kmem_cache_free
0.08 -0.0 0.05 ± 8% perf-profile.self.cycles-pp.locks_delete_lock_ctx
0.24 -0.0 0.21 ± 2% perf-profile.self.cycles-pp.locks_free_lock
0.13 -0.0 0.11 ± 6% perf-profile.self.cycles-pp._copy_from_user
0.11 -0.0 0.10 ± 5% perf-profile.self.cycles-pp.rcu_all_qs
0.23 ± 4% +0.0 0.25 perf-profile.self.cycles-pp.selinux_file_lock
0.29 +0.0 0.34 ± 3% perf-profile.self.cycles-pp.do_syscall_64
0.69 +0.0 0.74 ± 3% perf-profile.self.cycles-pp.avc_has_perm
1.32 +0.1 1.39 ± 2% perf-profile.self.cycles-pp.kmem_cache_alloc
1.81 +0.1 1.92 perf-profile.self.cycles-pp.entry_SYSCALL_64
1.94 +0.2 2.11 ± 2% perf-profile.self.cycles-pp.syscall_return_via_sysret
1.96 ± 2% +1.9 3.83 ± 5% perf-profile.self.cycles-pp._raw_spin_lock
***************************************************************************************************
lkp-bdw-ep3b: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.2/thread/100%/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3b/lock1/will-it-scale/0xb00002e
commit:
48a7a13ff3 ("locks: use properly initialized file_lock when unlocking.")
816f2fb5a2 ("fs/locks: allow a lock request to block other requests.")
48a7a13ff31f0728 816f2fb5a2fc678c2595ebf1bc
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
:2 50% 1:4 dmesg.WARNING:at#for_ip_interrupt_entry/0x
%stddev %change %stddev
\ | \
141975 -2.1% 138940 will-it-scale.time.involuntary_context_switches
398074 ± 4% -6.7% 371230 ± 4% softirqs.SCHED
7.54 +2.1% 7.70 turbostat.RAMWatt
7.55 ± 5% -11.8% 6.66 ± 8% sched_debug.cpu.clock.stddev
7.55 ± 5% -11.8% 6.66 ± 8% sched_debug.cpu.clock_task.stddev
2.50 ± 66% -66.7% 0.83 ±173% sched_debug.cpu.sched_goidle.min
258464 ± 12% +53.2% 396067 ± 34% numa-numastat.node0.local_node
267018 ± 8% +49.9% 400367 ± 33% numa-numastat.node0.numa_hit
392893 ± 7% -35.3% 254215 ± 53% numa-numastat.node1.local_node
401503 ± 5% -33.5% 267104 ± 50% numa-numastat.node1.numa_hit
2959 ± 5% -9.2% 2688 ± 4% slabinfo.eventpoll_pwq.active_objs
2959 ± 5% -9.2% 2688 ± 4% slabinfo.eventpoll_pwq.num_objs
4847 -8.1% 4456 slabinfo.kmalloc-1k.num_objs
1439 -15.9% 1209 ± 5% slabinfo.task_group.active_objs
1439 -15.9% 1209 ± 5% slabinfo.task_group.num_objs
58172 ± 22% +64.4% 95627 ± 16% numa-meminfo.node0.AnonHugePages
100354 ± 19% +35.8% 136280 ± 5% numa-meminfo.node0.AnonPages
186557 ± 3% -26.7% 136660 ± 21% numa-meminfo.node1.Active
186555 ± 3% -26.7% 136656 ± 21% numa-meminfo.node1.Active(anon)
120749 ± 12% -32.5% 81448 ± 18% numa-meminfo.node1.AnonHugePages
155979 ± 14% -23.8% 118836 ± 5% numa-meminfo.node1.AnonPages
25092 ± 19% +35.8% 34083 ± 5% numa-vmstat.node0.nr_anon_pages
542900 ± 2% +18.6% 644147 ± 10% numa-vmstat.node0.numa_hit
534282 +19.8% 639812 ± 10% numa-vmstat.node0.numa_local
46651 ± 3% -26.8% 34140 ± 21% numa-vmstat.node1.nr_active_anon
38983 ± 14% -23.8% 29693 ± 5% numa-vmstat.node1.nr_anon_pages
46651 ± 3% -26.8% 34140 ± 21% numa-vmstat.node1.nr_zone_active_anon
588746 ± 2% -16.5% 491444 ± 13% numa-vmstat.node1.numa_hit
418957 -24.3% 317150 ± 21% numa-vmstat.node1.numa_local
40.65 -0.6 40.04 perf-stat.cache-miss-rate%
7.64e+09 ± 4% +10.1% 8.415e+09 ± 2% perf-stat.cache-misses
1.88e+10 ± 4% +11.8% 2.101e+10 perf-stat.cache-references
323411 -2.0% 316921 perf-stat.context-switches
0.00 +0.0 0.00 ± 8% perf-stat.dTLB-store-miss-rate%
27829496 +12.8% 31382599 ± 9% perf-stat.dTLB-store-misses
91.66 ± 8% -10.9 80.76 perf-stat.iTLB-load-miss-rate%
3.743e+08 ± 93% +148.5% 9.3e+08 ± 6% perf-stat.iTLB-loads
79.18 +3.2 82.36 perf-stat.node-store-miss-rate%
1.876e+09 +23.6% 2.319e+09 perf-stat.node-store-misses
94.71 -33.7 61.02 ± 10% perf-profile.calltrace.cycles-pp._raw_spin_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
94.31 -33.7 60.64 ± 10% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.fcntl_setlk.do_fcntl.__x64_sys_fcntl
98.80 -0.0 98.76 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
98.68 -0.0 98.64 perf-profile.calltrace.cycles-pp.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
98.83 -0.0 98.79 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
0.00 +0.5 0.51 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
0.00 +33.3 33.29 ± 19% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.locks_move_blocks.posix_lock_inode.do_lock_file_wait
0.00 +33.6 33.61 ± 19% perf-profile.calltrace.cycles-pp._raw_spin_lock.locks_move_blocks.posix_lock_inode.do_lock_file_wait.fcntl_setlk
0.00 +33.6 33.63 ± 19% perf-profile.calltrace.cycles-pp.locks_move_blocks.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl
1.98 +33.8 35.73 ± 18% perf-profile.calltrace.cycles-pp.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl.do_syscall_64
1.58 +33.8 35.34 ± 18% perf-profile.calltrace.cycles-pp.posix_lock_inode.do_lock_file_wait.fcntl_setlk.do_fcntl.__x64_sys_fcntl
94.34 -0.4 93.97 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.46 -0.1 0.35 ± 11% perf-profile.children.cycles-pp.fput
98.83 -0.0 98.79 perf-profile.children.cycles-pp.do_syscall_64
98.87 -0.0 98.83 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
98.69 -0.0 98.65 perf-profile.children.cycles-pp.__x64_sys_fcntl
0.27 -0.0 0.24 ± 3% perf-profile.children.cycles-pp.kmem_cache_free
0.06 -0.0 0.05 perf-profile.children.cycles-pp.locks_release_private
0.05 +0.0 0.06 perf-profile.children.cycles-pp.locks_unlink_lock_ctx
0.07 +0.0 0.08 ± 5% perf-profile.children.cycles-pp.locks_delete_lock_ctx
0.56 +0.0 0.59 perf-profile.children.cycles-pp.syscall_return_via_sysret
0.06 ± 9% +0.2 0.24 ± 15% perf-profile.children.cycles-pp.locks_insert_lock_ctx
0.00 +33.6 33.63 ± 19% perf-profile.children.cycles-pp.locks_move_blocks
2.00 +33.8 35.74 ± 18% perf-profile.children.cycles-pp.do_lock_file_wait
1.60 +33.8 35.37 ± 18% perf-profile.children.cycles-pp.posix_lock_inode
94.03 -0.4 93.64 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.46 -0.1 0.35 ± 10% perf-profile.self.cycles-pp.fput
0.27 -0.0 0.24 perf-profile.self.cycles-pp.kmem_cache_free
0.12 -0.0 0.11 ± 4% perf-profile.self.cycles-pp.__might_sleep
0.56 +0.0 0.59 perf-profile.self.cycles-pp.syscall_return_via_sysret
0.51 +0.5 1.00 ± 2% perf-profile.self.cycles-pp._raw_spin_lock
***************************************************************************************************
lkp-bdw-ep3: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
filesystem/gcc-7/performance/1HDD/x86_64-rhel-7.2/100%/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3/stress-ng/1s/0xb00002e
commit:
48a7a13ff3 ("locks: use properly initialized file_lock when unlocking.")
816f2fb5a2 ("fs/locks: allow a lock request to block other requests.")
48a7a13ff31f0728 816f2fb5a2fc678c2595ebf1bc
---------------- --------------------------
%stddev %change %stddev
\ | \
54596710 ± 2% -93.2% 3733496 ± 3% stress-ng.flock.ops
53929186 ± 2% -93.2% 3689099 ± 3% stress-ng.flock.ops_per_sec
259.05 -5.8% 244.01 stress-ng.time.user_time
2559 ± 6% +10.5% 2828 ± 11% boot-time.idle
122946 ± 20% -18.6% 100119 ± 25% cpuidle.POLL.usage
3149 ± 34% +110.7% 6636 ± 33% numa-meminfo.node1.PageTables
812.00 ± 33% +105.3% 1666 ± 36% numa-vmstat.node1.nr_page_table_pages
24479 ± 8% +14.1% 27936 ± 12% numa-vmstat.node1.nr_slab_unreclaimable
1016 ± 11% +25.1% 1271 ± 11% slabinfo.Acpi-ParseExt.active_objs
1016 ± 11% +25.1% 1271 ± 11% slabinfo.Acpi-ParseExt.num_objs
12364 ± 5% +6.9% 13219 ± 6% slabinfo.avtab_node.active_objs
12364 ± 5% +6.9% 13219 ± 6% slabinfo.avtab_node.num_objs
760.50 ± 22% -39.6% 459.00 ± 13% slabinfo.skbuff_fclone_cache.active_objs
760.50 ± 22% -39.6% 459.00 ± 13% slabinfo.skbuff_fclone_cache.num_objs
3.80 ± 32% -49.1% 1.93 ± 12% sched_debug.cpu.cpu_load[2].avg
14.93 ± 65% -66.8% 4.95 ± 30% sched_debug.cpu.cpu_load[2].stddev
4.03 ± 24% -44.6% 2.23 ± 7% sched_debug.cpu.cpu_load[3].avg
3.36 ± 30% -43.2% 1.91 ± 4% sched_debug.cpu.cpu_load[4].avg
12204 ± 6% +13.7% 13877 ± 3% sched_debug.cpu.nr_load_updates.avg
11373 ± 3% +7.2% 12189 ± 4% sched_debug.cpu.nr_load_updates.min
900.36 ± 11% +20.8% 1087 ± 12% sched_debug.cpu.nr_switches.stddev
7.254e+11 -3.4% 7.004e+11 perf-stat.branch-instructions
7.726e+09 ± 2% -5.8% 7.282e+09 perf-stat.branch-misses
2.871e+10 ± 3% -6.4% 2.688e+10 perf-stat.cache-references
2.58 +1.3% 2.61 perf-stat.cpi
8.942e+12 -2.4% 8.726e+12 perf-stat.cpu-cycles
9.844e+11 -3.8% 9.47e+11 perf-stat.dTLB-loads
4.235e+11 -7.1% 3.934e+11 perf-stat.dTLB-stores
84.76 +7.3 92.04 perf-stat.iTLB-load-miss-rate%
2.804e+09 -5.5% 2.65e+09 perf-stat.iTLB-load-misses
5.042e+08 ± 2% -54.5% 2.295e+08 ± 11% perf-stat.iTLB-loads
3.472e+12 -3.6% 3.347e+12 perf-stat.instructions
0.39 -1.2% 0.38 perf-stat.ipc
57.31 ± 16% -21.8 35.52 ± 19% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
59.81 ± 15% -21.7 38.13 ± 17% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
0.00 +8.4 8.44 ±103% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__ioctl.perf_evlist__disable.cmd_record.run_builtin
0.00 +8.4 8.44 ±103% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__ioctl.perf_evlist__disable.cmd_record
0.00 +8.4 8.44 ±103% perf-profile.calltrace.cycles-pp.__ioctl.perf_evlist__disable.cmd_record.run_builtin.main
0.00 +8.4 8.44 ±103% perf-profile.calltrace.cycles-pp.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.__ioctl.perf_evlist__disable
0.00 +8.4 8.44 ±103% perf-profile.calltrace.cycles-pp.perf_evlist__disable.cmd_record.run_builtin.main.generic_start_main
57.31 ± 16% -20.0 37.30 ± 20% perf-profile.children.cycles-pp.intel_idle
1.38 ±114% +2.5 3.86 ± 22% perf-profile.children.cycles-pp.seq_read
0.00 +8.4 8.44 ±103% perf-profile.children.cycles-pp.perf_evlist__disable
31.08 ± 5% +17.4 48.48 ± 20% perf-profile.children.cycles-pp.do_syscall_64
31.08 ± 5% +18.0 49.12 ± 20% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
57.31 ± 16% -20.0 37.30 ± 20% perf-profile.self.cycles-pp.intel_idle
38563 ± 12% +49.7% 57722 ± 39% proc-vmstat.nr_active_anon
37804 ± 12% +50.2% 56782 ± 39% proc-vmstat.nr_anon_pages
35.25 ± 23% +97.9% 69.75 ± 60% proc-vmstat.nr_anon_transparent_hugepages
107.25 ± 19% -34.7% 70.00 ± 10% proc-vmstat.nr_dirtied
328918 ± 4% +6.0% 348537 ± 2% proc-vmstat.nr_file_pages
5888 +3.2% 6075 proc-vmstat.nr_inactive_anon
8789 +2.8% 9036 proc-vmstat.nr_mapped
6916 +2.9% 7115 proc-vmstat.nr_shmem
321790 ± 4% +6.0% 341178 ± 2% proc-vmstat.nr_unevictable
38563 ± 12% +49.7% 57722 ± 39% proc-vmstat.nr_zone_active_anon
5888 +3.2% 6075 proc-vmstat.nr_zone_inactive_anon
321790 ± 4% +6.0% 341179 ± 2% proc-vmstat.nr_zone_unevictable
5348 ± 12% +36.1% 7279 ± 14% proc-vmstat.numa_hint_faults
41707 ± 22% +259.4% 149913 ±104% proc-vmstat.numa_pte_updates
112842 -3.1% 109367 ± 2% proc-vmstat.pgactivate
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-4.20.0-rc2-00007-g816f2fb" of type "text/plain" (168529 bytes)
View attachment "job-script" of type "text/plain" (7181 bytes)
View attachment "job.yaml" of type "text/plain" (4803 bytes)
View attachment "reproduce" of type "text/plain" (309 bytes)