Date: Wed, 10 Oct 2018 09:51:43 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Yuchung Cheng <ycheng@...gle.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Wei Wang <weiwan@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>,
Eric Dumazet <edumazet@...gle.com>,
Soheil Hassas Yeganeh <soheil@...gle.com>,
LKML <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org,
lkp@...org
Subject: [LKP] [tcp] a337531b94: netperf.Throughput_Mbps -6.1% regression
Greetings,
FYI, we noticed a -6.1% regression of netperf.Throughput_Mbps due to commit:
commit: a337531b942bd8a03e7052444d7e36972aac2d92 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git master
in testcase: netperf
on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
with following parameters:
ip: ipv4
runtime: 900s
nr_threads: 200%
cluster: cs-localhost
test: TCP_STREAM
ucode: 0x7000013
cpufreq_governor: performance
test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/
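As a rough cross-check of the parameters above, the sketch below assumes that "nr_threads: 200%" is interpreted relative to the number of hardware threads on the test machine (16), which would give 32 concurrent netperf streams. The helper name stream_count is hypothetical and not part of lkp-tests. The ratio of netperf.Throughput_total_Mbps to netperf.Throughput_Mbps in the results of this report comes out close to 32 for both commits, which is consistent with that reading.

```python
def stream_count(nr_threads_percent: int, cpu_threads: int) -> int:
    """Concurrent streams, ASSUMING the percentage is relative to hardware threads."""
    return cpu_threads * nr_threads_percent // 100


streams = stream_count(200, 16)
print("assumed concurrent streams:", streams)

# (per-stream Mbps, total Mbps) as reported for the base and the tested commit
reported = {
    "3ff6cde846": (2497, 79924),
    "a337531b94": (2345, 75061),
}
for commit, (per_stream, total) in reported.items():
    # total / per-stream should land near the assumed stream count
    print(commit, round(total / per_stream, 2))
```

If the assumption holds, a per-stream drop from 2497 to 2345 Mbps and a total drop from 79924 to 75061 Mbps describe the same regression at two scales.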
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/900s/lkp-bdw-de1/TCP_STREAM/netperf/0x7000013
commit:
3ff6cde846 ("hns3: Another build fix.")
a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
3ff6cde846857d45 a337531b942bd8a03e7052444d
---------------- --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :4           50%           2:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
         %stddev     %change          %stddev
             \          |                \
      2497          -6.1%       2345        netperf.Throughput_Mbps
     79924          -6.1%      75061        netperf.Throughput_total_Mbps
    186513         +11.3%     207590        netperf.time.involuntary_context_switches
 5.488e+08          -6.1%  5.154e+08        netperf.workload
      1172 ± 34%   -37.6%     731.75 ±  5%  cpuidle.C1E.usage
      1137 ± 34%   -40.0%     682.25 ±  8%  turbostat.C1E
      2775 ± 11%   +17.5%       3261 ±  9%  sched_debug.cpu.nr_switches.stddev
      0.01 ± 17%   +28.2%       0.01 ± 10%  sched_debug.rt_rq:/.rt_time.avg
      0.14 ± 17%   +28.2%       0.18 ± 10%  sched_debug.rt_rq:/.rt_time.max
      0.03 ± 17%   +28.2%       0.04 ± 10%  sched_debug.rt_rq:/.rt_time.stddev
     66336          +0.9%      66948        proc-vmstat.nr_anon_pages
 2.755e+08          -6.1%  2.588e+08        proc-vmstat.numa_hit
 2.755e+08          -6.1%  2.588e+08        proc-vmstat.numa_local
 2.197e+09          -6.1%  2.064e+09        proc-vmstat.pgalloc_normal
 2.197e+09          -6.1%  2.064e+09        proc-vmstat.pgfree
 5.903e+11          -7.9%  5.438e+11        perf-stat.branch-instructions
      2.68           -0.0       2.64        perf-stat.branch-miss-rate%
 1.582e+10          -9.2%  1.436e+10        perf-stat.branch-misses
  6.26e+11          -4.7%  5.964e+11        perf-stat.cache-misses
  6.26e+11          -4.7%  5.964e+11        perf-stat.cache-references
     11.69          +8.6%      12.69        perf-stat.cpi
    123723          +2.1%     126291        perf-stat.cpu-migrations
      0.09 ±  2%     +0.0       0.09        perf-stat.dTLB-load-miss-rate%
 1.475e+12          -7.1%   1.37e+12        perf-stat.dTLB-loads
 1.094e+12          -6.9%  1.018e+12        perf-stat.dTLB-stores
 2.912e+08 ±  5%   -13.0%  2.533e+08        perf-stat.iTLB-loads
 3.019e+12          -7.9%  2.781e+12        perf-stat.instructions
      0.09          -7.9%       0.08        perf-stat.ipc
      5500          -1.9%       5394        perf-stat.path-length
      0.53 ±  2%     -0.2       0.38 ± 57%  perf-profile.calltrace.cycles-pp.ip_output.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
      0.63 ±  2%     -0.1       0.58 ±  4%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
      0.73 ±  3%     +0.1       0.78 ±  2%  perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
      0.96           +0.1       1.03        perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish
     98.02           +0.1      98.13        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     97.88           +0.1      98.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.70 ±  3%     -0.1       0.64 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.26 ±  5%     -0.0       0.21 ±  6%  perf-profile.children.cycles-pp._raw_spin_lock_bh
      0.28 ±  5%     -0.0       0.24 ±  6%  perf-profile.children.cycles-pp.lock_sock_nested
      0.46 ±  4%     -0.0       0.43 ±  2%  perf-profile.children.cycles-pp.nf_hook_slow
      0.21 ±  8%     -0.0       0.18 ±  5%  perf-profile.children.cycles-pp.tcp_rcv_space_adjust
      0.08 ±  5%     -0.0       0.06        perf-profile.children.cycles-pp.entry_SYSCALL_64_stage2
      0.08 ±  6%     -0.0       0.06 ±  6%  perf-profile.children.cycles-pp.ip_finish_output
      0.17 ±  6%     +0.0       0.20 ±  5%  perf-profile.children.cycles-pp.tcp_event_new_data_sent
      0.24 ±  4%     +0.0       0.27 ±  2%  perf-profile.children.cycles-pp.mod_timer
      0.15 ±  2%     +0.0       0.18 ±  2%  perf-profile.children.cycles-pp.__might_sleep
      0.80 ±  3%     +0.0       0.84 ±  2%  perf-profile.children.cycles-pp.tcp_clean_rtx_queue
      0.30 ±  3%     +0.1       0.36 ±  4%  perf-profile.children.cycles-pp.__might_fault
      1.61 ±  4%     +0.1       1.69        perf-profile.children.cycles-pp.__release_sock
      1.06 ±  2%     +0.1       1.14        perf-profile.children.cycles-pp.tcp_ack
     98.24           +0.1      98.36        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     98.09           +0.1      98.23        perf-profile.children.cycles-pp.do_syscall_64
     70.28           +0.6      70.86        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      1.56           -0.1       1.48 ±  3%  perf-profile.self.cycles-pp.copy_page_to_iter
      0.70 ±  3%     -0.1       0.64 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.37 ±  2%     -0.1       1.32 ±  2%  perf-profile.self.cycles-pp.__free_pages_ok
      0.55 ±  3%     -0.0       0.50 ±  3%  perf-profile.self.cycles-pp.__alloc_skb
      0.44 ±  3%     -0.0       0.40 ±  5%  perf-profile.self.cycles-pp.tcp_recvmsg
      0.16 ±  9%     -0.0       0.14 ±  5%  perf-profile.self.cycles-pp.sock_has_perm
      0.08 ±  6%     -0.0       0.06        perf-profile.self.cycles-pp.entry_SYSCALL_64_stage2
      0.10 ±  4%     +0.0       0.12 ±  6%  perf-profile.self.cycles-pp.tcp_clean_rtx_queue
      0.14 ±  6%     +0.0       0.17 ±  4%  perf-profile.self.cycles-pp.__might_sleep
     69.25           +0.5      69.77        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
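For readers who want to sanity-check the %change column, here is a minimal sketch that recomputes it as a relative change against the base commit. The function name pct_change is illustrative only and is not taken from lkp-tests. It appears to match the reported figures for the rows that carry a "%" sign. Rows without one (the perf-profile entries and the "-rate%" counters) look like absolute differences in percentage points and are not covered here.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# (base commit value, tested commit value) copied from the comparison table
rows = {
    "netperf.Throughput_Mbps": (2497, 2345),
    "netperf.time.involuntary_context_switches": (186513, 207590),
    "perf-stat.instructions": (3.019e12, 2.781e12),
    "perf-stat.cpi": (11.69, 12.69),
}

for metric, (base, new) in rows.items():
    print(f"{pct_change(base, new):+7.1f}%  {metric}")
```

Read together, these rows say the tested commit retired about 7.9% fewer instructions over the same 900s runtime while spending about 8.6% more cycles per instruction.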
[Three ASCII plots omitted: netperf.Throughput_Mbps (y-axis 0 to 3000), netperf.Throughput_total_Mbps (y-axis 0 to 90000), and netperf.workload (y-axis 0 to 6e+08), each showing one point per bisect sample.]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-4.19.0-rc5-00886-ga337531" of type "text/plain" (167752 bytes)
View attachment "job-script" of type "text/plain" (7561 bytes)
View attachment "job.yaml" of type "text/plain" (5116 bytes)
View attachment "reproduce" of type "text/plain" (2005 bytes)