Message-ID: <20170204070804.GC12121@yexl-desktop>
Date: Sat, 4 Feb 2017 15:08:04 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Manfred Spraul <manfred@...orfullife.com>
Cc: Stephen Rothwell <sfr@...b.auug.org.au>,
Peter Zijlstra <peterz@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
kernel test robot <xiaolong.ye@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp-robot] [ipc/sem.c] f4b5bafaf7: aim9.shared_memory.ops_per_sec
11.3% improvement
Greetings,
FYI, we noticed an 11.3% improvement of aim9.shared_memory.ops_per_sec due to commit:
commit: f4b5bafaf7c0a3b2f204e48c07b5335ed93266fa ("ipc/sem.c: avoid using spin_unlock_wait()")
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
in testcase: aim9
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 64G memory
with the following parameters:
testtime: 300s
test: shared_memory
cpufreq_governor: performance
test-description: Suite IX is the "AIM Independent Resource Benchmark", the famous synthetic benchmark.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/
In addition, the commit also has a significant impact on the following tests:
+------------------+------------------------------------------------------------------+
| testcase: change | aim9: aim9.shared_memory.ops_per_sec 11.5% improvement |
| test machine | 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory |
| test parameters | cpufreq_governor=performance |
| | test=shared_memory |
| | testtime=300s |
+------------------+------------------------------------------------------------------+
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
testcase/path_params/tbox_group/run: aim9/300s-shared_memory-performance/ivb43
6487b8d2876d7d39 f4b5bafaf7c0a3b2f204e48c07
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
1073533 ± 0% +11.3% 1194345 ± 0% aim9.shared_memory.ops_per_sec
3221639 ± 0% +11.2% 3584021 ± 0% aim9.time.minor_page_faults
28206 ± 8% -12.5% 24690 ± 0% meminfo.Active(file)
3.56 ± 1% -5.0% 3.38 ± 4% turbostat.RAMWatt
14128 ± 9% -12.8% 12326 ± 1% numa-meminfo.node0.Active(file)
14081 ± 8% -12.2% 12365 ± 0% numa-meminfo.node1.Active(file)
7051 ± 8% -12.5% 6172 ± 0% proc-vmstat.nr_active_file
7051 ± 8% -12.5% 6172 ± 0% proc-vmstat.nr_zone_active_file
3221639 ± 0% +11.2% 3584021 ± 0% time.minor_page_faults
41.48 ± 1% +9.7% 45.50 ± 1% time.user_time
3531 ± 9% -12.7% 3081 ± 0% numa-vmstat.node0.nr_active_file
3531 ± 9% -12.7% 3081 ± 0% numa-vmstat.node0.nr_zone_active_file
3520 ± 8% -12.2% 3091 ± 0% numa-vmstat.node1.nr_active_file
3520 ± 8% -12.2% 3091 ± 0% numa-vmstat.node1.nr_zone_active_file
1.26 ± 16% -70.4% 0.37 ± 71% perf-profile.calltrace.cycles-pp.pid_vnr.SYSC_semtimedop.sys_semop.entry_SYSCALL_64_fastpath
1.38 ± 18% -53.1% 0.65 ± 8% perf-profile.children.cycles-pp.pid_vnr
8.29 ± 8% -37.2% 5.20 ± 10% perf-profile.self.cycles-pp.SYSC_semtimedop
1.37 ± 19% -57.5% 0.58 ± 14% perf-profile.self.cycles-pp.pid_vnr
76641 ± 27% +64.8% 126335 ± 25% slabinfo.kmalloc-8.active_objs
76927 ± 27% +64.5% 126565 ± 25% slabinfo.kmalloc-8.num_objs
839.50 ± 4% -13.0% 730.00 ± 8% slabinfo.nsproxy.active_objs
839.50 ± 4% -13.0% 730.00 ± 8% slabinfo.nsproxy.num_objs
15877 ± 4% -6.7% 14819 ± 4% slabinfo.vm_area_struct.active_objs
15877 ± 4% -6.7% 14819 ± 4% slabinfo.vm_area_struct.num_objs
0.09 ±110% +188.8% 0.26 ± 31% sched_debug.cfs_rq:/.nr_spread_over.stddev
12.61 ± 31% -35.8% 8.10 ± 31% sched_debug.cfs_rq:/.removed_util_avg.stddev
7.42 ± 48% -48.3% 3.83 ± 83% sched_debug.cfs_rq:/.util_avg.min
341584 ± 4% +31.4% 448942 ± 5% sched_debug.cpu.avg_idle.min
138800 ± 3% -8.5% 127032 ± 1% sched_debug.cpu.avg_idle.stddev
1134 ± 12% -23.0% 873.83 ± 16% sched_debug.cpu.nr_switches.min
628.08 ± 39% -66.7% 209.39 ± 59% sched_debug.cpu.sched_count.min
215.04 ± 61% -91.7% 17.89 ± 88% sched_debug.cpu.sched_goidle.min
132.92 ± 30% -43.5% 75.06 ± 39% sched_debug.cpu.ttwu_count.min
3.713e+11 ± 4% +20.5% 4.476e+11 ± 7% perf-stat.branch-instructions
1.32 ± 5% -9.1% 1.20 ± 3% perf-stat.branch-miss-rate%
4.887e+09 ± 5% +9.4% 5.348e+09 ± 3% perf-stat.branch-misses
0.13 ± 6% -16.6% 0.11 ± 5% perf-stat.dTLB-load-miss-rate%
4.368e+11 ± 6% +17.7% 5.139e+11 ± 0% perf-stat.dTLB-loads
0.04 ± 10% -13.3% 0.04 ± 0% perf-stat.dTLB-store-miss-rate%
2.071e+12 ± 4% +20.4% 2.494e+12 ± 6% perf-stat.instructions
12681 ± 4% +18.0% 14964 ± 5% perf-stat.instructions-per-iTLB-miss
0.92 ± 1% +11.8% 1.03 ± 1% perf-stat.ipc
3784094 ± 0% +9.5% 4145210 ± 0% perf-stat.minor-faults
3784100 ± 0% +9.5% 4145210 ± 0% perf-stat.page-faults
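For readers who want to sanity-check the %change column, here is a minimal sketch that recomputes it from the two means in a row. It assumes the column is simply (patched - base) / base, which matches the rows tried below; the helper name pct_change is made up for this illustration and is not part of lkp-tests.

```python
def pct_change(base: float, patched: float) -> float:
    """Relative change of the patched mean versus the base mean, in percent."""
    return (patched - base) / base * 100.0


# (base mean, patched mean) copied from the comparison table above:
# base = 6487b8d2876d7d39, patched = f4b5bafaf7c0a3b2f204e48c07
rows = {
    "aim9.shared_memory.ops_per_sec": (1073533, 1194345),
    "aim9.time.minor_page_faults": (3221639, 3584021),
    "perf-stat.minor-faults": (3784094, 4145210),
}

for name, (base, patched) in rows.items():
    print(f"{pct_change(base, patched):+6.1f}%  {name}")
```

Rows whose means are printed with only two or three significant digits (perf-stat.ipc, for example) may not reproduce exactly from the rounded values shown, so small mismatches there are expected.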
[ASCII plots omitted: perf-stat.page-faults, perf-stat.minor-faults, aim9.shared_memory.ops_per_sec and aim9.time.minor_page_faults, one point per bisect sample]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.10.0-rc4-00447-gf4b5baf" of type "text/plain" (155600 bytes)
View attachment "job-script" of type "text/plain" (6468 bytes)
View attachment "job.yaml" of type "text/plain" (4138 bytes)
View attachment "reproduce" of type "text/plain" (103 bytes)