Message-ID: <20170814060507.GE23258@yexl-desktop>
Date: Mon, 14 Aug 2017 14:05:07 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Guillaume Knispel <guillaume.knispel@...ersonicimagine.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Manfred Spraul <manfred@...orfullife.com>,
Kees Cook <keescook@...omium.org>,
Davidlohr Bueso <dave@...olabs.net>,
Alexey Dobriyan <adobriyan@...il.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Serge Hallyn <serge@...lyn.com>,
Andrey Vagin <avagin@...nvz.org>,
Guillaume Knispel <guillaume.knispel@...ersonicimagine.com>,
Marc Pardo <marc.pardo@...ersonicimagine.com>,
linux-kernel@...r.kernel.org, lkp@...org
Subject: [lkp-robot] [ipc] cb6268f05d: reaim.jobs_per_min 865% improvement
Greetings,
FYI, we noticed an 865% improvement of reaim.jobs_per_min due to commit:
commit: cb6268f05df684e00607762fd8ad95d515e2407f ("ipc: optimize semget/shmget/msgget for lots of keys")
url: https://github.com/0day-ci/linux/commits/Guillaume-Knispel/ipc-optimize-semget-shmget-msgget-for-lots-of-keys/20170731-170031
in testcase: reaim
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with the following parameters:
	runtime: 300s
	nr_task: 5000
	test: shared_memory
	cpufreq_governor: performance
test-description: REAIM is an updated and improved version of AIM 7 benchmark.
test-url: https://sourceforge.net/projects/re-aim-7/
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
        git clone https://github.com/01org/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml
testcase/path_params/tbox_group/run: reaim/300s-5000-shared_memory-performance/lkp-hsw-ep5
fd2b2c57ec2020ae  cb6268f05df684e00607762fd8
----------------  --------------------------
    634033 ±  4%     865%     6120365  reaim.jobs_per_min
       126 ±  4%     865%        1224  reaim.jobs_per_min_child
    672755 ±  5%     831%     6263184  reaim.max_jobs_per_min
     34.14 ±  3%      11%       37.82  reaim.std_dev_percent
     65.40            -6%       61.66  reaim.jti
     13.53            -7%       12.60  reaim.child_utime
     47.48 ±  4%     -90%        4.90  reaim.parent_time
      1981 ±  4%     -90%         204  reaim.child_systime
     12.19           -90%        1.24  reaim.std_dev_time
   5160570 ±  7%     517%    31838188  reaim.time.minor_page_faults
     88.17 ±  7%     474%      505.69  reaim.time.user_time
    125994 ± 11%     244%      433632  reaim.time.involuntary_context_switches
   4053795 ±  7%      79%     7256949  reaim.time.voluntary_context_switches
      3974           -28%        2864  reaim.time.percent_of_cpu_this_job_got
     12855 ±  4%     -36%        8249  reaim.time.system_time
     27741 ±  3%      86%       51490  vmstat.system.cs
    214773 ±  5%      44%      308264  interrupts.CAL:Function_call_interrupts
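The %change column above appears to be the relative difference between the two commits' mean values, (new - base) / base. A minimal Python sketch, assuming that definition (the helper name percent_change is hypothetical and not part of lkp-tests), recomputes a few of the reported figures from the rounded values in the table:

```python
def percent_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed definition)."""
    return (new - base) / base * 100.0


# (base commit mean, patched commit mean), copied from the table above.
rows = {
    "reaim.jobs_per_min": (634033, 6120365),
    "reaim.max_jobs_per_min": (672755, 6263184),
    "reaim.parent_time": (47.48, 4.90),
    "reaim.time.system_time": (12855, 8249),
}

for name, (base, new) in rows.items():
    print(f"{name}: {percent_change(base, new):+.0f}%")
```

Because the table prints rounded means, a recomputed value can differ from the reported one by a point or so on rows with small values, such as reaim.jobs_per_min_child.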
fail:runs  %reproduction  fail:runs
    |            |            |
       :4         25%          1:4  stderr.create_shared_memory():can't_create_shared_memory,pausing
         0          6e+05      551946 ±104%  latency_stats.avg.call_rwsem_down_write_failed.shmctl_down.SyS_shmctl.do_syscall_64.return_from_SYSCALL_64
    240707 ±  7%   -2e+05       81933        latency_stats.avg.call_rwsem_down_write_failed.ipcget.SyS_shmget.entry_SYSCALL_64_fastpath
    282714 ±  4%   -2e+05       76327        latency_stats.avg.call_rwsem_down_write_failed.shm_close.remove_vma.do_munmap.SyS_shmdt.entry_SYSCALL_64_fastpath
    313951 ±  5%   -2e+05       78957        latency_stats.avg.call_rwsem_down_write_failed.do_shmat.SyS_shmat.entry_SYSCALL_64_fastpath
    341015 ±  4%   -3e+05       78091        latency_stats.avg.call_rwsem_down_write_failed.shmctl_down.SyS_shmctl.entry_SYSCALL_64_fastpath
         0          6e+05      551946 ±104%  latency_stats.max.call_rwsem_down_write_failed.shmctl_down.SyS_shmctl.do_syscall_64.return_from_SYSCALL_64
  21599230 ±  3%   -2e+07     3153822 ±  6%  latency_stats.max.call_rwsem_down_write_failed.shmctl_down.SyS_shmctl.entry_SYSCALL_64_fastpath
  21608679 ±  3%   -2e+07     3152519 ±  6%  latency_stats.max.call_rwsem_down_write_failed.ipcget.SyS_shmget.entry_SYSCALL_64_fastpath
  21612440 ±  3%   -2e+07     3153028 ±  6%  latency_stats.max.call_rwsem_down_write_failed.do_shmat.SyS_shmat.entry_SYSCALL_64_fastpath
  21613940 ±  3%   -2e+07     3154107 ±  6%  latency_stats.max.call_rwsem_down_write_failed.shm_close.remove_vma.do_munmap.SyS_shmdt.entry_SYSCALL_64_fastpath
  21615866 ±  3%   -2e+07     3154900 ±  6%  latency_stats.max.max
 3.835e+10 ±  4%    9e+10   1.254e+11 ±  7%  latency_stats.sum.io_schedule.__lock_page.do_wp_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
 1.672e+09 ± 22%    5e+09   6.765e+09 ± 86%  latency_stats.sum.call_rwsem_down_write_failed.ipcget.SyS_semget.entry_SYSCALL_64_fastpath
   2757771 ± 40%    2e+08   2.425e+08 ± 26%  latency_stats.sum.io_schedule.wait_on_page_bit.__migration_entry_wait.migration_entry_wait.do_swap_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
         0          6e+05      551946 ±104%  latency_stats.sum.call_rwsem_down_write_failed.shmctl_down.SyS_shmctl.do_syscall_64.return_from_SYSCALL_64
     24449 ± 67%    2e+05      200503 ±  8%  latency_stats.sum.io_schedule.__lock_page_or_retry.filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
     40790 ± 10%    2e+05      191446 ± 11%  latency_stats.sum.ep_poll.SyS_epoll_wait.do_syscall_64.return_from_SYSCALL_64
     27684 ± 19%    1e+05      172543 ± 13%  latency_stats.sum.devkmsg_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
     52252 ±  4%    1e+05      189551 ±  5%  latency_stats.sum.wait_woken.inotify_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
      6530 ± 95%    9e+04       91966 ± 21%  latency_stats.sum.io_schedule.__lock_page_killable.__lock_page_or_retry.filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
      5976 ± 20%    3e+04       34861 ±  4%  latency_stats.sum.pipe_wait.pipe_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
 2.659e+11 ±  3%   -2e+11   8.371e+10        latency_stats.sum.call_rwsem_down_write_failed.do_shmat.SyS_shmat.entry_SYSCALL_64_fastpath
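Unlike the rows further up, the middle column of the latency_stats rows does not look like a percentage. Judging from the numbers, it seems to be the absolute difference (new - base) printed with one significant digit. A short Python sketch under that assumption (format_delta is a hypothetical helper, not an lkp-tests function):

```python
def format_delta(base: float, new: float) -> str:
    """Absolute change new - base with one significant digit, e.g. '-2e+05'."""
    return f"{new - base:.0e}"


# (base, new) pairs taken from the latency_stats rows above.
samples = {
    "avg ipcget.SyS_shmget": (240707, 81933),
    "avg shmctl_down.SyS_shmctl (fastpath)": (341015, 78091),
    "max do_shmat.SyS_shmat": (21612440, 3153028),
    "sum do_shmat.SyS_shmat": (2.659e11, 8.371e10),
    "sum devkmsg_read": (27684, 172543),
}

for name, (base, new) in samples.items():
    print(f"{name}: {format_delta(base, new)}")
```

Read this way, the rwsem write-lock wait in the shmget, shmat, shmdt and shmctl paths drops by several times on average and in the worst case, which is consistent with the throughput gain reported at the top.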
   5864993 ±  7%     454%    32495687  perf-stat.page-faults
   5864993 ±  7%     454%    32495687  perf-stat.minor-faults
      0.11 ±  3%     422%        0.60  perf-stat.branch-miss-rate%
 1.501e+08 ±  7%     356%   6.847e+08  perf-stat.node-store-misses
 4.444e+11 ±  6%     316%   1.849e+12  perf-stat.dTLB-stores
 1.077e+08 ±  5%     314%    4.46e+08  perf-stat.node-loads
 1.067e+09 ±  7%     287%   4.133e+09  perf-stat.cache-misses
 6.282e+08 ±  7%     275%   2.355e+09  perf-stat.node-load-misses
 1.757e+08 ±  7%     261%   6.339e+08  perf-stat.node-stores
  4.56e+09 ±  6%     232%   1.516e+10  perf-stat.branch-misses
      3.20           212%        9.98  perf-stat.cache-miss-rate%
  71703040 ±  6%     163%   1.884e+08  perf-stat.dTLB-store-misses
 2.831e+08 ± 11%     111%   5.964e+08  perf-stat.iTLB-loads
   9086070 ±  7%      74%    15819624  perf-stat.context-switches
   1340441 ±  8%      72%     2308738  perf-stat.cpu-migrations
 2.164e+09 ±  7%      32%   2.847e+09  perf-stat.iTLB-load-misses
 3.337e+10 ±  5%      24%    4.14e+10  perf-stat.cache-references
     46.07            13%       51.93  perf-stat.node-store-miss-rate%
      0.66             9%        0.72  perf-stat.ipc
     85.35                      84.08  perf-stat.node-load-miss-rate%
     88.46            -7%       82.67  perf-stat.iTLB-load-miss-rate%
      1.51            -8%        1.39  perf-stat.cpi
 5.134e+12 ±  4%     -28%   3.698e+12  perf-stat.dTLB-loads
 1.972e+13 ±  4%     -32%   1.342e+13  perf-stat.instructions
 3.976e+12 ±  4%     -36%   2.533e+12  perf-stat.branch-instructions
      0.02 ±  6%     -37%        0.01  perf-stat.dTLB-store-miss-rate%
 2.977e+13 ±  4%     -37%   1.865e+13  perf-stat.cpu-cycles
      9132 ±  3%     -48%        4715  perf-stat.instructions-per-iTLB-miss
      0.06 ± 27%     -68%        0.02  perf-stat.dTLB-load-miss-rate%
 3.271e+09 ± 27%     -77%   7.473e+08  perf-stat.dTLB-load-misses
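Several perf-stat rows are ratios of other rows in the same table. The sketch below assumes the usual definitions (ipc = instructions / cpu-cycles, cpi = cpu-cycles / instructions, cache-miss-rate% = cache-misses / cache-references * 100) and uses the rounded counter values printed above. The dictionary layout and function name are illustrative only.

```python
# Raw counters for the base and patched commits, copied from the table above.
counters = {
    "base": {
        "instructions": 1.972e13,
        "cpu-cycles": 2.977e13,
        "cache-misses": 1.067e9,
        "cache-references": 3.337e10,
    },
    "patched": {
        "instructions": 1.342e13,
        "cpu-cycles": 1.865e13,
        "cache-misses": 4.133e9,
        "cache-references": 4.14e10,
    },
}


def derived_metrics(c: dict) -> dict:
    """Compute ratio metrics from raw counters (assumed standard definitions)."""
    return {
        "ipc": c["instructions"] / c["cpu-cycles"],
        "cpi": c["cpu-cycles"] / c["instructions"],
        "cache-miss-rate%": c["cache-misses"] / c["cache-references"] * 100.0,
    }


for label, c in counters.items():
    metrics = derived_metrics(c)
    print(label, {name: round(value, 2) for name, value in metrics.items()})
```

The recomputed ratios agree with the reported ipc, cpi and cache-miss-rate% rows to two decimal places. The patched kernel retires about a third fewer instructions over the run while completing far more jobs, so less of the CPU time goes to the kernel-side work that dominated before.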
[ASCII plot: reaim.parent_time. The [*] samples lie roughly between 45 and 50; the [O] samples sit near 5.]
[ASCII plot: reaim.child_systime. The [*] samples lie roughly between 1800 and 2200; the [O] samples sit near 200.]
[ASCII plot: reaim.jobs_per_min. The [*] samples lie below 1e+06; the [O] samples lie between 6e+06 and 7e+06.]
[ASCII plot: reaim.jobs_per_min_child. The [*] samples lie below 200; the [O] samples lie between 1200 and 1400.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Xiaolong
View attachment "config-4.13.0-rc2-00023-gcb6268f0" of type "text/plain" (160961 bytes)
View attachment "job-script" of type "text/plain" (6767 bytes)
View attachment "job.yaml" of type "text/plain" (4438 bytes)
View attachment "reproduce" of type "text/plain" (4906 bytes)