Open Source and information security mailing list archives
 
Date:   Wed, 10 Oct 2018 09:51:43 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Yuchung Cheng <ycheng@...gle.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Wei Wang <weiwan@...gle.com>,
        Neal Cardwell <ncardwell@...gle.com>,
        Eric Dumazet <edumazet@...gle.com>,
        Soheil Hassas Yeganeh <soheil@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org,
        lkp@...org
Subject: [LKP] [tcp]  a337531b94:  netperf.Throughput_Mbps -6.1% regression

Greetings,

FYI, we noticed a -6.1% regression of netperf.Throughput_Mbps due to commit:


commit: a337531b942bd8a03e7052444d7e36972aac2d92 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git master
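
To check whether a given kernel runs with the larger initial receive buffer, one can read the net.ipv4.tcp_rmem sysctl, whose three fields are the minimum, default and maximum receive buffer sizes in bytes. The following is a minimal sketch, assuming the "128KB" in the commit title corresponds to a default field of at least 128 * 1024 bytes (an assumption, not something this report states); the function names are illustrative only.

```python
from pathlib import Path

TCP_RMEM = Path("/proc/sys/net/ipv4/tcp_rmem")  # standard Linux procfs location

def parse_tcp_rmem(text: str) -> tuple[int, int, int]:
    """Split the sysctl text into (min, default, max) byte counts."""
    lo, default, hi = (int(field) for field in text.split())
    return lo, default, hi

def has_128k_default(text: str) -> bool:
    # Assumption: "128KB" in the commit title means 128 * 1024 bytes.
    return parse_tcp_rmem(text)[1] >= 128 * 1024

if __name__ == "__main__" and TCP_RMEM.exists():
    raw = TCP_RMEM.read_text()
    print(parse_tcp_rmem(raw), has_128k_default(raw))
```

Running this on both the parent and the tested kernel would show whether the default field differs between them.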

in testcase: netperf
on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
with following parameters:

	ip: ipv4
	runtime: 900s
	nr_threads: 200%
	cluster: cs-localhost
	test: TCP_STREAM
	ucode: 0x7000013
	cpufreq_governor: performance

test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
  cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2018-04-03.cgz/900s/lkp-bdw-de1/TCP_STREAM/netperf/0x7000013

commit: 
  3ff6cde846 ("hns3: Another build fix.")
  a337531b94 ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")

3ff6cde846857d45 a337531b942bd8a03e7052444d 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           50%           2:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
         %stddev     %change         %stddev
             \          |                \  
      2497            -6.1%       2345        netperf.Throughput_Mbps
     79924            -6.1%      75061        netperf.Throughput_total_Mbps
    186513           +11.3%     207590        netperf.time.involuntary_context_switches
 5.488e+08            -6.1%  5.154e+08        netperf.workload
      1172 ± 34%     -37.6%     731.75 ±  5%  cpuidle.C1E.usage
      1137 ± 34%     -40.0%     682.25 ±  8%  turbostat.C1E
      2775 ± 11%     +17.5%       3261 ±  9%  sched_debug.cpu.nr_switches.stddev
      0.01 ± 17%     +28.2%       0.01 ± 10%  sched_debug.rt_rq:/.rt_time.avg
      0.14 ± 17%     +28.2%       0.18 ± 10%  sched_debug.rt_rq:/.rt_time.max
      0.03 ± 17%     +28.2%       0.04 ± 10%  sched_debug.rt_rq:/.rt_time.stddev
     66336            +0.9%      66948        proc-vmstat.nr_anon_pages
 2.755e+08            -6.1%  2.588e+08        proc-vmstat.numa_hit
 2.755e+08            -6.1%  2.588e+08        proc-vmstat.numa_local
 2.197e+09            -6.1%  2.064e+09        proc-vmstat.pgalloc_normal
 2.197e+09            -6.1%  2.064e+09        proc-vmstat.pgfree
 5.903e+11            -7.9%  5.438e+11        perf-stat.branch-instructions
      2.68            -0.0        2.64        perf-stat.branch-miss-rate%
 1.582e+10            -9.2%  1.436e+10        perf-stat.branch-misses
  6.26e+11            -4.7%  5.964e+11        perf-stat.cache-misses
  6.26e+11            -4.7%  5.964e+11        perf-stat.cache-references
     11.69            +8.6%      12.69        perf-stat.cpi
    123723            +2.1%     126291        perf-stat.cpu-migrations
      0.09 ±  2%      +0.0        0.09        perf-stat.dTLB-load-miss-rate%
 1.475e+12            -7.1%   1.37e+12        perf-stat.dTLB-loads
 1.094e+12            -6.9%  1.018e+12        perf-stat.dTLB-stores
 2.912e+08 ±  5%     -13.0%  2.533e+08        perf-stat.iTLB-loads
 3.019e+12            -7.9%  2.781e+12        perf-stat.instructions
      0.09            -7.9%       0.08        perf-stat.ipc
      5500            -1.9%       5394        perf-stat.path-length
      0.53 ±  2%      -0.2        0.38 ± 57%  perf-profile.calltrace.cycles-pp.ip_output.__ip_queue_xmit.__tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
      0.63 ±  2%      -0.1        0.58 ±  4%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
      0.73 ±  3%      +0.1        0.78 ±  2%  perf-profile.calltrace.cycles-pp.tcp_clean_rtx_queue.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
      0.96            +0.1        1.03        perf-profile.calltrace.cycles-pp.tcp_ack.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish
     98.02            +0.1       98.13        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     97.88            +0.1       98.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.70 ±  3%      -0.1        0.64 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.26 ±  5%      -0.0        0.21 ±  6%  perf-profile.children.cycles-pp._raw_spin_lock_bh
      0.28 ±  5%      -0.0        0.24 ±  6%  perf-profile.children.cycles-pp.lock_sock_nested
      0.46 ±  4%      -0.0        0.43 ±  2%  perf-profile.children.cycles-pp.nf_hook_slow
      0.21 ±  8%      -0.0        0.18 ±  5%  perf-profile.children.cycles-pp.tcp_rcv_space_adjust
      0.08 ±  5%      -0.0        0.06        perf-profile.children.cycles-pp.entry_SYSCALL_64_stage2
      0.08 ±  6%      -0.0        0.06 ±  6%  perf-profile.children.cycles-pp.ip_finish_output
      0.17 ±  6%      +0.0        0.20 ±  5%  perf-profile.children.cycles-pp.tcp_event_new_data_sent
      0.24 ±  4%      +0.0        0.27 ±  2%  perf-profile.children.cycles-pp.mod_timer
      0.15 ±  2%      +0.0        0.18 ±  2%  perf-profile.children.cycles-pp.__might_sleep
      0.80 ±  3%      +0.0        0.84 ±  2%  perf-profile.children.cycles-pp.tcp_clean_rtx_queue
      0.30 ±  3%      +0.1        0.36 ±  4%  perf-profile.children.cycles-pp.__might_fault
      1.61 ±  4%      +0.1        1.69        perf-profile.children.cycles-pp.__release_sock
      1.06 ±  2%      +0.1        1.14        perf-profile.children.cycles-pp.tcp_ack
     98.24            +0.1       98.36        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     98.09            +0.1       98.23        perf-profile.children.cycles-pp.do_syscall_64
     70.28            +0.6       70.86        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      1.56            -0.1        1.48 ±  3%  perf-profile.self.cycles-pp.copy_page_to_iter
      0.70 ±  3%      -0.1        0.64 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.37 ±  2%      -0.1        1.32 ±  2%  perf-profile.self.cycles-pp.__free_pages_ok
      0.55 ±  3%      -0.0        0.50 ±  3%  perf-profile.self.cycles-pp.__alloc_skb
      0.44 ±  3%      -0.0        0.40 ±  5%  perf-profile.self.cycles-pp.tcp_recvmsg
      0.16 ±  9%      -0.0        0.14 ±  5%  perf-profile.self.cycles-pp.sock_has_perm
      0.08 ±  6%      -0.0        0.06        perf-profile.self.cycles-pp.entry_SYSCALL_64_stage2
      0.10 ±  4%      +0.0        0.12 ±  6%  perf-profile.self.cycles-pp.tcp_clean_rtx_queue
      0.14 ±  6%      +0.0        0.17 ±  4%  perf-profile.self.cycles-pp.__might_sleep
     69.25            +0.5       69.77        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
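
As a sanity check on the %change column, the figures can be recomputed from the two means in each row. This is a minimal sketch, assuming %change is (new - base) / base taken on the reported means; the helper name pct_change is illustrative and not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0

# (mean at 3ff6cde846, mean at a337531b94), copied from the table above
rows = {
    "netperf.Throughput_Mbps": (2497, 2345),
    "netperf.Throughput_total_Mbps": (79924, 75061),
    "netperf.time.involuntary_context_switches": (186513, 207590),
}

for name, (base, new) in rows.items():
    print(f"{pct_change(base, new):+6.1f}%  {name}")
```

Rounded to one decimal place, this reproduces the -6.1%, -6.1% and +11.3% shown in the table. The total is also about 32 times the per-instance figure (79924 / 2497 is roughly 32.0), which would be consistent with nr_threads: 200% on a 16-thread machine, though that reading of the parameter is an assumption.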


                                                                                
                              netperf.Throughput_Mbps                           
                                                                                
  3000 +-+------------------------------------------------------------------+   
       |                                                                    |   
  2500 +-+..+.+..+.+..+.+..+.+..+.+..+.+..+.+.+..+.+..+.+..+.+..+.+..+.+..+.|   
       O O  O O  O O  O O  O O  O O  O O  O O O  O O  O O  O O  O O         |   
       | :                                                                  |   
  2000 +-+                                                                  |   
       |:                                                                   |   
  1500 +-+                                                                  |   
       |:                                                                   |   
  1000 +-+                                                                  |   
       |:                                                                   |   
       |:                                                                   |   
   500 +-+                                                                  |   
       |                                                                    |   
     0 +-+------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                            netperf.Throughput_total_Mbps                       
                                                                                
  90000 +-+-----------------------------------------------------------------+   
        |                                                                   |   
  80000 O-O..O.O..O.O..O.O.O..O.O..O.O..O.O.O..O.O..O.O..O.O.O..O.O..+.+..+.|   
  70000 +-+                                                                 |   
        | :                                                                 |   
  60000 +-+                                                                 |   
  50000 +-+                                                                 |   
        |:                                                                  |   
  40000 +-+                                                                 |   
  30000 +-+                                                                 |   
        |:                                                                  |   
  20000 +-+                                                                 |   
  10000 +-+                                                                 |   
        |                                                                   |   
      0 +-+-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                  netperf.workload                              
                                                                                
  6e+08 +-+-----------------------------------------------------------------+   
        | +..+.+..+.+..+.+.+..+.+..+.+..+.+.+..+.+..+.+..+.+.+..+.+..+.+..+.|   
  5e+08 O-O  O O  O O  O O O  O O  O O  O O O  O O  O O  O O O  O O         |   
        | :                                                                 |   
        | :                                                                 |   
  4e+08 +-+                                                                 |   
        |:                                                                  |   
  3e+08 +-+                                                                 |   
        |:                                                                  |   
  2e+08 +-+                                                                 |   
        |:                                                                  |   
        |                                                                   |   
  1e+08 +-+                                                                 |   
        |                                                                   |   
      0 +-+-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen

View attachment "config-4.19.0-rc5-00886-ga337531" of type "text/plain" (167752 bytes)

View attachment "job-script" of type "text/plain" (7561 bytes)

View attachment "job.yaml" of type "text/plain" (5116 bytes)

View attachment "reproduce" of type "text/plain" (2005 bytes)
