Message-ID: <20161019015513.GB343@yexl-desktop>
Date: Wed, 19 Oct 2016 09:55:13 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: LKML <linux-kernel@...r.kernel.org>,
Stephen Rothwell <sfr@...b.auug.org.au>, lkp@...org
Subject: [lkp] [switch generic_file_splice_read() to use of ] d90e6d8861:
netperf.Throughput_Mbps -4.5% regression
FYI, we noticed a -4.5% regression of netperf.Throughput_Mbps due to commit:
commit d90e6d886195df0be797daf673e25980b8f4ef6f ("switch generic_file_splice_read() to use of ->read_iter()")
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
in testcase: netperf
on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
with following parameters:
ip: ipv4
runtime: 300s
nr_threads: 200%
cluster: cs-localhost
test: TCP_SENDFILE
cpufreq_governor: performance
Netperf is a benchmark that can be used to measure various aspects of networking performance.
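For readers unfamiliar with the TCP_SENDFILE test, the following is a minimal sketch of the kind of transfer it exercises: file data pushed to a TCP socket over localhost with sendfile(2). It is not netperf's implementation; the function name `sendfile_roundtrip` and the payload size are hypothetical and chosen only for illustration, and `os.sendfile` is assumed to be available (Linux). On the kernels compared in this report, this syscall reaches generic_file_splice_read() through the do_sendfile / do_splice_direct path that appears in the perf-profile call traces.

```python
import os
import socket
import tempfile
import threading


def sendfile_roundtrip(payload: bytes) -> bytes:
    """Write payload to a temp file, send it over localhost TCP with
    sendfile(2), and return the bytes seen by the receiver."""
    received = bytearray()

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]

        def drain() -> None:
            # Receiver side: accept one connection and read until EOF.
            conn, _ = listener.accept()
            with conn:
                while True:
                    chunk = conn.recv(65536)
                    if not chunk:
                        break
                    received.extend(chunk)

        receiver = threading.Thread(target=drain)
        receiver.start()

        with tempfile.TemporaryFile() as src, \
                socket.create_connection(("127.0.0.1", port)) as sender:
            src.write(payload)
            src.flush()
            offset = 0
            # sendfile() may send fewer bytes than requested, so loop.
            while offset < len(payload):
                sent = os.sendfile(sender.fileno(), src.fileno(),
                                   offset, len(payload) - offset)
                if sent == 0:
                    break
                offset += sent

        # Closing the sender socket above signals EOF to the receiver.
        receiver.join()

    return bytes(received)


if __name__ == "__main__":
    data = b"x" * (1 << 20)  # 1 MiB, an arbitrary illustrative size
    assert sendfile_roundtrip(data) == data
```

The file contents never pass through a user-space buffer on the sending side, so the cost of the transfer is dominated by the kernel's page-cache lookup and splice code, which is the path the commit under test changes.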
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
cs-localhost/gcc-6/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2016-08-31.cgz/300s/lkp-bdw-de1/TCP_SENDFILE/netperf
commit:
b97c7fd842 ("new iov_iter flavour: pipe-backed")
d90e6d8861 ("switch generic_file_splice_read() to use of ->read_iter()")
b97c7fd84246932c            d90e6d886195df0be797daf673
----------------            --------------------------
 value ± %stddev   %change   value ± %stddev   metric
11725 ± 0% -4.5% 11200 ± 0% netperf.Throughput_Mbps
920.25 ± 0% +3.9% 956.50 ± 0% netperf.time.percent_of_cpu_this_job_got
2710 ± 0% +4.1% 2822 ± 0% netperf.time.system_time
3336711 ± 0% -4.6% 3183680 ± 0% netperf.time.voluntary_context_switches
141835 ± 14% -21.8% 110861 ± 9% cpuidle.C1E-BDW.time
3877 ± 6% +9.4% 4241 ± 2% slabinfo.kmalloc-256.active_objs
32763 ± 0% -2.8% 31859 ± 1% vmstat.system.cs
17.39 ± 3% +569.1% 116.38 ±129% sched_debug.cfs_rq:/.load_avg.stddev
2.12 ± 9% -10.9% 1.89 ± 9% sched_debug.rt_rq:/.rt_time.max
0.51 ± 9% -11.1% 0.46 ± 9% sched_debug.rt_rq:/.rt_time.stddev
1.508e+12 ± 0% +4.9% 1.582e+12 ± 0% perf-stat.branch-instructions
0.22 ± 0% -24.1% 0.17 ± 0% perf-stat.branch-miss-rate%
3.292e+09 ± 0% -20.4% 2.621e+09 ± 0% perf-stat.branch-misses
1.989e+11 ± 0% -4.7% 1.895e+11 ± 0% perf-stat.cache-misses
1.989e+11 ± 0% -4.7% 1.895e+11 ± 0% perf-stat.cache-references
9922182 ± 0% -2.8% 9648609 ± 1% perf-stat.context-switches
0.02 ± 86% -52.0% 0.01 ± 0% perf-stat.dTLB-store-miss-rate%
3.363e+08 ± 86% -52.0% 1.614e+08 ± 0% perf-stat.dTLB-store-misses
80.07 ± 6% -21.8% 62.60 ± 10% perf-stat.iTLB-load-miss-rate%
2.065e+08 ± 11% -44.2% 1.152e+08 ± 2% perf-stat.iTLB-load-misses
8.509e+12 ± 0% +2.9% 8.758e+12 ± 0% perf-stat.instructions
41814 ± 12% +82.0% 76096 ± 3% perf-stat.instructions-per-iTLB-miss
0.76 ± 0% +2.8% 0.78 ± 0% perf-stat.ipc
5.75 ± 1% -100.0% 0.00 ± -1% perf-profile.calltrace.cycles-pp.__generic_file_splice_read.generic_file_splice_read.do_splice_to.splice_direct_to_actor.do_splice_direct
0.00 ± -1% +Inf% 2.12 ± 1% perf-profile.calltrace.cycles-pp.copy_page_to_iter.generic_file_read_iter.generic_file_splice_read.do_splice_to.splice_direct_to_actor
8.49 ± 1% +54.9% 13.16 ± 0% perf-profile.calltrace.cycles-pp.do_splice_to.splice_direct_to_actor.do_splice_direct.do_sendfile.sys_sendfile64
0.00 ± -1% +Inf% 2.83 ± 2% perf-profile.calltrace.cycles-pp.find_get_entry.pagecache_get_page.generic_file_read_iter.generic_file_splice_read.do_splice_to
2.91 ± 1% -100.0% 0.00 ± -1% perf-profile.calltrace.cycles-pp.find_get_pages_contig.__generic_file_splice_read.generic_file_splice_read.do_splice_to.splice_direct_to_actor
0.00 ± -1% +Inf% 10.05 ± 0% perf-profile.calltrace.cycles-pp.generic_file_read_iter.generic_file_splice_read.do_splice_to.splice_direct_to_actor.do_splice_direct
6.68 ± 1% +71.1% 11.43 ± 0% perf-profile.calltrace.cycles-pp.generic_file_splice_read.do_splice_to.splice_direct_to_actor.do_splice_direct.do_sendfile
0.00 ± -1% +Inf% 3.46 ± 1% perf-profile.calltrace.cycles-pp.pagecache_get_page.generic_file_read_iter.generic_file_splice_read.do_splice_to.splice_direct_to_actor
0.00 ± -1% +Inf% 1.13 ± 3% perf-profile.calltrace.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.generic_file_read_iter.generic_file_splice_read
2.94 ± 1% +14.1% 3.35 ± 1% perf-profile.children.cycles-pp.___might_sleep
5.77 ± 1% -100.0% 0.00 ± -1% perf-profile.children.cycles-pp.__generic_file_splice_read
0.00 ± -1% +Inf% 1.04 ± 3% perf-profile.children.cycles-pp.__radix_tree_lookup
0.97 ± 2% +32.8% 1.30 ± 1% perf-profile.children.cycles-pp.atime_needs_update
8.54 ± 1% +54.5% 13.20 ± 0% perf-profile.children.cycles-pp.do_splice_to
0.00 ± -1% +Inf% 2.97 ± 2% perf-profile.children.cycles-pp.find_get_entry
2.94 ± 1% -100.0% 0.00 ± -1% perf-profile.children.cycles-pp.find_get_pages_contig
0.00 ± -1% +Inf% 10.07 ± 0% perf-profile.children.cycles-pp.generic_file_read_iter
6.71 ± 1% +71.5% 11.50 ± 0% perf-profile.children.cycles-pp.generic_file_splice_read
0.00 ± -1% +Inf% 3.54 ± 1% perf-profile.children.cycles-pp.pagecache_get_page
0.00 ± -1% +Inf% 1.28 ± 2% perf-profile.children.cycles-pp.radix_tree_lookup_slot
0.98 ± 5% +42.9% 1.41 ± 0% perf-profile.children.cycles-pp.touch_atime
2.94 ± 1% +14.1% 3.35 ± 1% perf-profile.self.cycles-pp.___might_sleep
1.97 ± 1% -100.0% 0.00 ± -1% perf-profile.self.cycles-pp.__generic_file_splice_read
0.00 ± -1% +Inf% 1.04 ± 3% perf-profile.self.cycles-pp.__radix_tree_lookup
0.50 ± 4% +430.8% 2.67 ± 1% perf-profile.self.cycles-pp.copy_page_to_iter
0.00 ± -1% +Inf% 1.74 ± 1% perf-profile.self.cycles-pp.find_get_entry
2.16 ± 2% -100.0% 0.00 ± -1% perf-profile.self.cycles-pp.find_get_pages_contig
0.00 ± -1% +Inf% 2.29 ± 1% perf-profile.self.cycles-pp.generic_file_read_iter
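The %change column is the relative difference between the value for the parent commit (b97c7fd842) and the value for the commit under test (d90e6d8861). A small sketch that recomputes it for a few rows of the table above; the helper name `percent_change` is hypothetical, and because the table prints rounded means, a recomputed figure can differ from the reported one in the last digit for some rows.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (parent commit value, tested commit value), copied from the table above
rows = {
    "netperf.Throughput_Mbps": (11725, 11200),
    "netperf.time.system_time": (2710, 2822),
    "perf-stat.branch-misses": (3.292e9, 2.621e9),
}

for name, (base, new) in rows.items():
    print(f"{percent_change(base, new):+6.1f}%  {name}")

# prints:
#   -4.5%  netperf.Throughput_Mbps
#   +4.1%  netperf.time.system_time
#  -20.4%  perf-stat.branch-misses
```

Rows whose parent value is 0.00 are reported as +Inf% because the division by the base is undefined there; the helper above would raise ZeroDivisionError for such rows.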
netperf.time.system_time
[ASCII plot: y-axis from 2200 to 2900, one point per bisect sample]
netperf.time.percent_of_cpu_this_job_got
[ASCII plot: y-axis from 750 to 1000, one point per bisect sample]
[*] bisect-good sample
[O] bisect-bad sample
Thanks,
Xiaolong
View attachment "config-4.8.0-rc8-00009-gd90e6d8" of type "text/plain" (152576 bytes)
View attachment "job-script" of type "text/plain" (6885 bytes)
View attachment "job.yaml" of type "text/plain" (4390 bytes)
View attachment "reproduce" of type "text/plain" (1913 bytes)