Message-ID: <20180508053451.GD30203@yexl-desktop>
Date: Tue, 8 May 2018 13:34:51 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: linux-kernel@...r.kernel.org, lkp@...org
Subject: [lkp-robot] [mm] e27be240df: will-it-scale.per_process_ops -27.2% regression
Greetings,
FYI, we noticed a 27.2% regression in will-it-scale.per_process_ops due to commit:
commit: e27be240df53f1a20c659168e722b5d9f16cc7f4 ("mm: memcg: make sure memory.events is uptodate when waking pollers")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: will-it-scale
on test machine: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
with the following parameters:
nr_task: 100%
mode: process
test: page_fault3
cpufreq_governor: performance
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
Details are as below:
-------------------------------------------------------------------------------------------------->
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.2/process/100%/debian-x86_64-2016-08-31.cgz/lkp-hsw-ep4/page_fault3/will-it-scale
commit:
a38c015f31 ("mm/ksm.c: fix inconsistent accounting of zero pages")
e27be240df ("mm: memcg: make sure memory.events is uptodate when waking pollers")
a38c015f3156895b               e27be240df53f1a20c659168e7
----------------               --------------------------
     value %stddev    %change       value %stddev  metric
    639324             -27.2%      465226          will-it-scale.per_process_ops
  46031421             -27.2%    33496351          will-it-scale.workload
     17.55               -3.2       14.38          mpstat.cpu.usr%
   1130383 ± 6%        -19.6%      909067 ± 4%     softirqs.RCU
     95892 ± 2%         -7.5%       88706 ± 3%     vmstat.system.in
      2714              +2.0%        2768          turbostat.Avg_MHz
      0.43 ± 9%        -33.3%        0.29 ± 15%    turbostat.CPU%c1
     15.72              -2.5%       15.33          turbostat.RAMWatt
  15220184             -26.9%    11118535          numa-numastat.node0.local_node
  15223689             -26.9%    11125573          numa-numastat.node0.numa_hit
  15236149             -22.2%    11857182          numa-numastat.node1.local_node
  15246716             -22.2%    11864179          numa-numastat.node1.numa_hit
   8676822             -22.6%     6714739          numa-vmstat.node0.numa_hit
   8673095             -22.7%     6707502          numa-vmstat.node0.numa_local
   8661159             -19.7%     6951620          numa-vmstat.node1.numa_hit
   8481025             -20.1%     6775023          numa-vmstat.node1.numa_local
  30466411             -24.6%    22979746          proc-vmstat.numa_hit
  30452327             -24.6%    22965700          proc-vmstat.numa_local
  30512939             -24.6%    23021801          proc-vmstat.pgalloc_normal
 1.386e+10             -27.2%   1.008e+10          proc-vmstat.pgfault
  28718588 ± 3%        -24.0%    21818568 ± 5%     proc-vmstat.pgfree
     62.72 ± 10%       -21.8%       49.06 ± 2%     sched_debug.cfs_rq:/.exec_clock.stddev
     80883 ± 10%       -14.1%       69503 ± 6%     sched_debug.cfs_rq:/.min_vruntime.stddev
      2.04 ± 3%        +10.0%        2.24 ± 2%     sched_debug.cfs_rq:/.nr_spread_over.stddev
    119225 ± 11%       -58.0%       50132 ± 59%    sched_debug.cfs_rq:/.spread0.avg
    199133 ± 7%        -35.3%      128853 ± 23%    sched_debug.cfs_rq:/.spread0.max
     80591 ± 10%       -14.1%       69247 ± 6%     sched_debug.cfs_rq:/.spread0.stddev
 6.275e+12             -27.3%   4.565e+12          perf-stat.branch-instructions
 4.772e+10 ± 2%        -26.7%   3.498e+10          perf-stat.branch-misses
     55.58              -20.5       35.13          perf-stat.cache-miss-rate%
 2.658e+10             -20.4%   2.116e+10          perf-stat.cache-misses
 4.782e+10             +26.0%   6.025e+10          perf-stat.cache-references
      1.86             +40.3%        2.60          perf-stat.cpi
 5.875e+13              +2.0%   5.994e+13          perf-stat.cpu-cycles
 8.997e+12             -27.4%   6.532e+12          perf-stat.dTLB-loads
      2.94               -0.5        2.48          perf-stat.dTLB-store-miss-rate%
 1.599e+11             -38.9%   9.764e+10          perf-stat.dTLB-store-misses
  5.27e+12             -27.2%   3.838e+12          perf-stat.dTLB-stores
 2.684e+10             -27.3%    1.95e+10          perf-stat.iTLB-load-misses
 3.166e+13             -27.3%   2.303e+13          perf-stat.instructions
      0.54             -28.7%        0.38          perf-stat.ipc
 1.386e+10             -27.2%   1.009e+10          perf-stat.minor-faults
      0.57 ± 10%        +10.9       11.49          perf-stat.node-load-miss-rate%
  67281213 ± 10%     +1624.2%    1.16e+09          perf-stat.node-load-misses
 1.179e+10             -24.2%   8.934e+09          perf-stat.node-loads
      5.02               +0.6        5.64          perf-stat.node-store-miss-rate%
  7.36e+08             -15.5%   6.216e+08          perf-stat.node-store-misses
 1.393e+10             -25.3%   1.041e+10          perf-stat.node-stores
 1.386e+10             -27.2%   1.009e+10          perf-stat.page-faults
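As a sanity check on the comparison above, the headline figures can be recomputed from the raw values. This is only a minimal sketch and not part of the LKP tooling: the helper name pct_change is made up for illustration, and the formulas for the derived metrics (workload as the sum of per-process ops, cpi as cpu-cycles over instructions, cache-miss-rate% as cache-misses over cache-references) are assumptions about how those columns are defined.

```python
# Figures copied from the comparison table (base commit a38c015f31,
# tested commit e27be240df).
base_ops, new_ops = 639324, 465226
base_workload, new_workload = 46031421, 33496351


def pct_change(base, new):
    """Relative change from base to new, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


ops_change = pct_change(base_ops, new_ops)
workload_change = pct_change(base_workload, new_workload)

# Assumption: workload is the sum of per-process ops over all test
# processes, so the ratio approximates the number of processes.
procs_base = base_workload / base_ops
procs_new = new_workload / new_ops

# Assumption: cpi = cpu-cycles / instructions.
cpi_base = 5.875e13 / 3.166e13
cpi_new = 5.994e13 / 2.303e13

# Assumption: cache-miss-rate% = cache-misses / cache-references * 100.
miss_rate_base = 2.658e10 / 4.782e10 * 100.0
miss_rate_new = 2.116e10 / 6.025e10 * 100.0

print(f"per_process_ops change: {ops_change:.1f}%")
print(f"workload change:        {workload_change:.1f}%")
print(f"implied processes:      {procs_base:.1f} -> {procs_new:.1f}")
print(f"cpi:                    {cpi_base:.2f} -> {cpi_new:.2f}")
print(f"cache-miss-rate%:       {miss_rate_base:.2f} -> {miss_rate_new:.2f}")
```

Under these assumptions the recomputed values agree with the reported columns to within the rounding of the inputs, and the implied process count comes out at about 72, which is consistent with the 72-thread test machine and nr_task: 100%.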
will-it-scale.per_process_ops
[ASCII plot: y-axis 400000 to 700000; bisect-good samples at roughly 600000 to 700000, bisect-bad samples at roughly 400000 to 500000]
will-it-scale.workload
[ASCII plot: y-axis 2.8e+07 to 5e+07; bisect-good samples at roughly 4.2e+07 to 4.8e+07, bisect-bad samples at roughly 3e+07 to 3.4e+07]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.16.0-10982-ge27be24" of type "text/plain" (164002 bytes)
View attachment "job-script" of type "text/plain" (7020 bytes)
View attachment "job.yaml" of type "text/plain" (4671 bytes)
View attachment "reproduce" of type "text/plain" (305 bytes)