Message-ID: <20161017022504.GG22605@yexl-desktop>
Date: Mon, 17 Oct 2016 10:25:04 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Manfred Spraul <manfred@...orfullife.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp] [ipc/sem.c] 5864a2fd30: aim9.shared_memory.ops_per_sec -13.0% regression
FYI, we noticed a -13.0% regression of aim9.shared_memory.ops_per_sec due to commit:
commit 5864a2fd3088db73d47942370d0f7210a807b9bc ("ipc/sem.c: fix complex_count vs. simple op race")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
in testcase: aim9
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
with following parameters:
testtime: 300s
test: shared_memory
cpufreq_governor: performance
Suite IX is the "AIM Independent Resource Benchmark": the famous synthetic benchmark.
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
gcc-6/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsx04/shared_memory/aim9/300s
commit:
65deb8af76 ("kcov: do not instrument lib/stackdepot.c")
5864a2fd30 ("ipc/sem.c: fix complex_count vs. simple op race")
65deb8af76defeae 5864a2fd3088db73d47942370d
---------------- --------------------------
%stddev %change %stddev
\ | \
1450659 ± 0% -13.0% 1262704 ± 0% aim9.shared_memory.ops_per_sec
4353113 ± 0% -13.0% 3789220 ± 0% aim9.time.minor_page_faults
251.75 ± 0% +2.5% 258.09 ± 0% aim9.time.system_time
48.25 ± 0% -13.1% 41.91 ± 0% aim9.time.user_time
967475 ± 8% -26.3% 713361 ± 9% cpuidle.C1-HSW.time
6351555 ± 7% +10.8% 7036941 ± 0% meminfo.DirectMap2M
4856604 ± 0% -40.6% 2882512 ± 65% numa-numastat.node0.local_node
7725 ± 74% +80.1% 13913 ± 0% numa-numastat.node0.numa_foreign
4856610 ± 0% -40.6% 2882518 ± 65% numa-numastat.node0.numa_hit
7725 ± 74% +80.1% 13913 ± 0% numa-numastat.node0.numa_miss
16976 ± 2% +15.2% 19550 ± 4% slabinfo.kmalloc-512.active_objs
17132 ± 2% +15.1% 19712 ± 4% slabinfo.kmalloc-512.num_objs
133771 ± 3% -13.6% 115576 ± 9% slabinfo.kmalloc-8.active_objs
134332 ± 3% -13.4% 116281 ± 9% slabinfo.kmalloc-8.num_objs
5553071 ± 0% -11.7% 4903071 ± 0% proc-vmstat.numa_hit
5553039 ± 0% -11.7% 4903039 ± 0% proc-vmstat.numa_local
7787767 ± 0% -11.2% 6913807 ± 0% proc-vmstat.pgalloc_normal
5088685 ± 0% -11.0% 4527573 ± 0% proc-vmstat.pgfault
7812098 ± 0% -11.3% 6930089 ± 0% proc-vmstat.pgfree
39721 ± 35% -46.2% 21378 ± 72% numa-meminfo.node1.Active
32736 ± 42% -54.8% 14793 ±108% numa-meminfo.node1.Active(anon)
25797 ± 50% -61.0% 10062 ±127% numa-meminfo.node1.AnonHugePages
32721 ± 43% -54.8% 14777 ±108% numa-meminfo.node1.AnonPages
916.00 ± 14% -50.5% 453.67 ± 18% numa-meminfo.node1.PageTables
15945 ± 11% +23.8% 19737 ± 10% numa-meminfo.node2.SReclaimable
647.67 ± 18% +60.8% 1041 ± 15% numa-meminfo.node3.PageTables
2195 ± 80% +128.1% 5008 ± 14% numa-meminfo.node3.Shmem
5.152e+09 ± 5% +6.4% 5.481e+09 ± 4% perf-stat.branch-misses
0.19 ± 6% +9.0% 0.20 ± 2% perf-stat.dTLB-load-miss-rate%
1.28e+09 ± 1% -3.7% 1.233e+09 ± 0% perf-stat.dTLB-load-misses
6.934e+11 ± 7% -12.0% 6.101e+11 ± 2% perf-stat.dTLB-loads
3.348e+08 ± 0% -2.3% 3.27e+08 ± 0% perf-stat.dTLB-store-misses
84.70 ± 1% -4.4% 81.00 ± 2% perf-stat.iTLB-load-miss-rate%
1.219e+09 ± 1% -16.5% 1.018e+09 ± 14% perf-stat.iTLB-load-misses
2.202e+08 ± 4% +6.6% 2.348e+08 ± 2% perf-stat.iTLB-loads
2102 ± 29% +39.9% 2941 ± 13% perf-stat.instructions-per-iTLB-miss
5064989 ± 0% -11.1% 4500438 ± 0% perf-stat.minor-faults
57745625 ± 10% -44.9% 31799796 ± 53% perf-stat.node-store-misses
9693133 ± 9% -20.9% 7668803 ± 23% perf-stat.node-stores
5064525 ± 0% -11.1% 4500444 ± 0% perf-stat.page-faults
2.35 ± 23% -24.4% 1.78 ± 14% perf-profile.calltrace.cycles-pp.__tick_nohz_idle_enter.tick_nohz_irq_exit.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt
3.00 ± 23% -26.4% 2.21 ± 14% perf-profile.calltrace.cycles-pp.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
2.42 ± 23% -24.3% 1.83 ± 14% perf-profile.calltrace.cycles-pp.tick_nohz_irq_exit.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter
2.44 ± 23% -24.5% 1.84 ± 14% perf-profile.children.cycles-pp.__tick_nohz_idle_enter
0.96 ± 17% -24.9% 0.72 ± 19% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
0.72 ± 28% -27.2% 0.53 ± 14% perf-profile.children.cycles-pp.hrtimer_start_range_ns
3.13 ± 23% -25.1% 2.34 ± 14% perf-profile.children.cycles-pp.irq_exit
1.01 ± 76% -65.2% 0.35 ± 24% perf-profile.children.cycles-pp.rest_init
1.01 ± 76% -65.2% 0.35 ± 24% perf-profile.children.cycles-pp.start_kernel
2.49 ± 24% -24.3% 1.88 ± 13% perf-profile.children.cycles-pp.tick_nohz_irq_exit
1.01 ± 76% -65.2% 0.35 ± 24% perf-profile.children.cycles-pp.x86_64_start_kernel
1.01 ± 76% -65.2% 0.35 ± 24% perf-profile.children.cycles-pp.x86_64_start_reservations
2.34 ± 16% +48.1% 3.47 ± 16% perf-profile.self.cycles-pp.SYSC_semtimedop
0.96 ± 17% -24.9% 0.72 ± 19% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
33058 ± 17% +19.8% 39614 ± 1% numa-vmstat.node0.numa_foreign
2553587 ± 0% -38.9% 1560692 ± 59% numa-vmstat.node0.numa_hit
2553582 ± 0% -38.9% 1560686 ± 59% numa-vmstat.node0.numa_local
33058 ± 17% +19.8% 39614 ± 1% numa-vmstat.node0.numa_miss
8185 ± 42% -54.8% 3699 ±108% numa-vmstat.node1.nr_active_anon
8179 ± 43% -54.8% 3693 ±108% numa-vmstat.node1.nr_anon_pages
226.33 ± 15% -50.7% 111.67 ± 19% numa-vmstat.node1.nr_page_table_pages
8186 ± 42% -54.8% 3699 ±108% numa-vmstat.node1.nr_zone_active_anon
3986 ± 11% +23.8% 4933 ± 10% numa-vmstat.node2.nr_slab_reclaimable
161.00 ± 18% +61.1% 259.33 ± 15% numa-vmstat.node3.nr_page_table_pages
548.33 ± 80% +128.3% 1251 ± 14% numa-vmstat.node3.nr_shmem
88775 ± 4% -7.1% 82488 ± 4% numa-vmstat.node3.numa_foreign
177250 ± 9% +378.1% 847409 ±109% numa-vmstat.node3.numa_hit
177245 ± 9% +378.1% 847405 ±109% numa-vmstat.node3.numa_local
88775 ± 4% -7.1% 82488 ± 4% numa-vmstat.node3.numa_miss
0.00 ± 0% +3.7e+09% 36.92 ± 38% sched_debug.cfs_rq:/.MIN_vruntime.avg
0.00 ± 0% +3.6e+11% 3602 ± 15% sched_debug.cfs_rq:/.MIN_vruntime.max
0.00 ± 0% +1.8e+25% 345.99 ± 12% sched_debug.cfs_rq:/.MIN_vruntime.stddev
191.05 ± 2% +9.4% 208.96 ± 4% sched_debug.cfs_rq:/.load_avg.avg
573.00 ± 16% +53.3% 878.22 ± 11% sched_debug.cfs_rq:/.load_avg.max
49.42 ± 14% +51.3% 74.77 ± 9% sched_debug.cfs_rq:/.load_avg.stddev
0.00 ± 0% +3.7e+09% 36.92 ± 38% sched_debug.cfs_rq:/.max_vruntime.avg
0.00 ± 0% +3.6e+11% 3602 ± 15% sched_debug.cfs_rq:/.max_vruntime.max
0.00 ± 0% +1.8e+25% 345.99 ± 12% sched_debug.cfs_rq:/.max_vruntime.stddev
849009 ± 9% -36.3% 540564 ± 25% sched_debug.cfs_rq:/.min_vruntime.max
69571 ± 9% -36.5% 44149 ± 26% sched_debug.cfs_rq:/.min_vruntime.stddev
0.19 ± 40% -57.0% 0.08 ± 89% sched_debug.cfs_rq:/.nr_spread_over.stddev
2.85 ± 29% +71.1% 4.87 ± 11% sched_debug.cfs_rq:/.runnable_load_avg.avg
313.78 ± 36% +96.8% 617.61 ± 14% sched_debug.cfs_rq:/.runnable_load_avg.max
27.10 ± 35% +90.5% 51.62 ± 13% sched_debug.cfs_rq:/.runnable_load_avg.stddev
813659 ± 9% -38.0% 504715 ± 27% sched_debug.cfs_rq:/.spread0.max
69571 ± 9% -36.5% 44151 ± 26% sched_debug.cfs_rq:/.spread0.stddev
65.19 ± 9% -62.2% 24.62 ± 27% sched_debug.cpu.clock.stddev
65.19 ± 9% -62.2% 24.62 ± 27% sched_debug.cpu.clock_task.stddev
2.84 ± 29% +70.4% 4.85 ± 11% sched_debug.cpu.cpu_load[0].avg
313.78 ± 36% +96.8% 617.61 ± 14% sched_debug.cpu.cpu_load[0].max
27.10 ± 35% +90.5% 51.61 ± 13% sched_debug.cpu.cpu_load[0].stddev
3.00 ± 34% +73.6% 5.21 ± 7% sched_debug.cpu.cpu_load[1].avg
312.39 ± 36% +102.1% 631.44 ± 12% sched_debug.cpu.cpu_load[1].max
27.24 ± 36% +94.8% 53.07 ± 11% sched_debug.cpu.cpu_load[1].stddev
2.88 ± 34% +78.3% 5.13 ± 8% sched_debug.cpu.cpu_load[2].avg
310.06 ± 37% +104.0% 632.61 ± 12% sched_debug.cpu.cpu_load[2].max
26.75 ± 36% +98.8% 53.18 ± 11% sched_debug.cpu.cpu_load[2].stddev
2.68 ± 34% +86.8% 5.01 ± 9% sched_debug.cpu.cpu_load[3].avg
303.11 ± 37% +110.0% 636.39 ± 12% sched_debug.cpu.cpu_load[3].max
25.82 ± 36% +106.2% 53.24 ± 11% sched_debug.cpu.cpu_load[3].stddev
2.44 ± 33% +99.3% 4.87 ± 9% sched_debug.cpu.cpu_load[4].avg
289.67 ± 36% +119.5% 635.78 ± 12% sched_debug.cpu.cpu_load[4].max
24.41 ± 35% +117.4% 53.07 ± 11% sched_debug.cpu.cpu_load[4].stddev
0.00 ± 12% -54.4% 0.00 ± 19% sched_debug.cpu.next_balance.stddev
44337 ± 46% +66.1% 73663 ± 4% sched_debug.cpu.nr_switches.max
5304 ± 31% +34.0% 7105 ± 5% sched_debug.cpu.nr_switches.stddev
-21.06 ±-17% -38.5% -12.94 ±-35% sched_debug.cpu.nr_uninterruptible.min
20897 ± 48% +72.1% 35970 ± 4% sched_debug.cpu.sched_goidle.max
2524 ± 33% +37.8% 3478 ± 5% sched_debug.cpu.sched_goidle.stddev
21067 ± 45% +88.3% 39667 ± 11% sched_debug.cpu.ttwu_count.max
2771 ± 34% +39.0% 3851 ± 8% sched_debug.cpu.ttwu_count.stddev
10886 ± 41% +117.4% 23669 ± 14% sched_debug.cpu.ttwu_local.max
1101 ± 31% +84.7% 2033 ± 15% sched_debug.cpu.ttwu_local.stddev
perf-stat.page-faults
6e+06 ++------------------------------------------------------------------+
| |
5e+06 **.****.***.****.***.****.***.****.****.***.****.***.****.***.** *.**
OO OO O OOO OOOO OOO OOOO OOO OOOO OOO OOO OO O O : : |
| O O O O : : |
4e+06 ++ : : |
| : : |
3e+06 ++ : : |
| :: |
2e+06 ++ : |
| : |
| : |
1e+06 ++ : |
| : |
0 ++-----------------------------------------------O--------------*---+
perf-stat.minor-faults
6e+06 ++------------------------------------------------------------------+
| |
5e+06 **.****.***.****.***.****.***.****.****.***.****.***.****.***.** *.**
OO OO O OOO OOOO OOO OOOO OOO OOOO OOO OOO OO O O : : |
| O O O O : : |
4e+06 ++ : : |
| : : |
3e+06 ++ : : |
| :: |
2e+06 ++ : |
| : |
| : |
1e+06 ++ : |
| : |
0 ++-----------------------------------------------O--------------*---+
aim9.shared_memory.ops_per_sec
1.6e+06 ++----------------------------------------------------------------+
** .** .*** .** *.* **. ***.****.****.* **. ** .** *.****.* * ***
1.4e+06 ++* ** * * * * * * * * *: : |
OOO OOOO OOOO OOOO OOOO OOOO OOOO OOOO OOOO OOO OO : : |
1.2e+06 ++ : : |
1e+06 ++ : : |
| : : |
800000 ++ : : |
| :: |
600000 ++ :: |
400000 ++ :: |
| : |
200000 ++ : |
| : |
0 ++---------------------------------------------O--------------*---+
aim9.time.minor_page_faults
4.5e+06 ++-----------------------*----***-**-------------------*-*-*-*--***
***.****.****.****.****.* **.* **.****.****.****.* * *: : |
4e+06 OOO OOOO OOOO OOOO OOOO OOOO OOOO OOOO OOOO OOO OO : : |
3.5e+06 ++ : : |
| : : |
3e+06 ++ : : |
2.5e+06 ++ : : |
| :: |
2e+06 ++ :: |
1.5e+06 ++ :: |
| :: |
1e+06 ++ : |
500000 ++ : |
| : |
0 ++---------------------------------------------O--------------*---+
[*] bisect-good sample
[O] bisect-bad sample
Thanks,
Xiaolong
View attachment "config-4.8.0-11974-g5864a2f" of type "text/plain" (153666 bytes)
View attachment "job-script" of type "text/plain" (6516 bytes)
View attachment "job.yaml" of type "text/plain" (4084 bytes)
View attachment "reproduce" of type "text/plain" (103 bytes)