Date:   Fri, 14 Oct 2016 10:28:26 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: [lkp] [switch generic_file_splice_read() to use of ]  82c156f853:
 netperf.Throughput_Mbps -4.3% regression


FYI, we noticed a -4.3% regression of netperf.Throughput_Mbps due to commit:

commit 82c156f853840645604acd7c2cebcb75ed1b6652 ("switch generic_file_splice_read() to use of ->read_iter()")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

in testcase: netperf
on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
with the following parameters:

	ip: ipv4
	runtime: 300s
	nr_threads: 200%
	cluster: cs-localhost
	test: TCP_SENDFILE
	cpufreq_governor: performance
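
For readers unfamiliar with the job parameters: the sketch below shows how a percentage-style nr_threads value could translate into a concrete thread count. It assumes the percentage is relative to the number of CPU threads on the test machine (16 here). The helper name resolve_nr_threads is hypothetical and is not taken from lkp-tests.

```python
def resolve_nr_threads(spec: str, nr_cpu: int) -> int:
    """Turn an nr_threads spec such as '200%' or '8' into a thread count.

    Assumption: a trailing '%' means "percent of the machine's CPU threads".
    """
    spec = spec.strip()
    if spec.endswith("%"):
        percent = int(spec[:-1])
        # Integer arithmetic; never go below one thread.
        return max(1, nr_cpu * percent // 100)
    return int(spec)


# The test machine in this report has 16 CPU threads.
threads = resolve_nr_threads("200%", 16)
print(threads)
```

Under that assumption, the run oversubscribes the machine with two netperf threads per CPU thread.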

Netperf is a benchmark that can be used to measure various aspects of networking performance.



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
  cs-localhost/gcc-6/performance/ipv4/x86_64-rhel-7.2/200%/debian-x86_64-2016-08-31.cgz/300s/lkp-bdw-de1/TCP_SENDFILE/netperf

commit: 
  241699cd72 ("new iov_iter flavour: pipe-backed")
  82c156f853 ("switch generic_file_splice_read() to use of ->read_iter()")

241699cd72a8489c 82c156f853840645604acd7c2c 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     11696 ±  0%      -4.3%      11194 ±  0%  netperf.Throughput_Mbps
     32745 ± 13%    +110.1%      68800 ± 40%  netperf.time.involuntary_context_switches
    919.25 ±  0%      +4.4%     959.50 ±  0%  netperf.time.percent_of_cpu_this_job_got
      2708 ±  0%      +4.6%       2831 ±  0%  netperf.time.system_time
   3331360 ±  0%      -4.8%    3169954 ±  0%  netperf.time.voluntary_context_switches
     22484 ±  2%     +11.3%      25033 ±  6%  meminfo.Active(file)
    314919 ± 16%    +132.9%     733525 ± 44%  softirqs.RCU
     32891 ±  0%      -3.3%      31793 ±  1%  vmstat.system.cs
      5620 ±  2%     +11.3%       6258 ±  6%  proc-vmstat.nr_active_file
      5620 ±  2%     +11.3%       6258 ±  6%  proc-vmstat.nr_zone_active_file
      5389 ± 94%    +199.8%      16156 ± 53%  sched_debug.cfs_rq:/.spread0.max
      7.82 ±  9%     +37.9%      10.78 ±  7%  sched_debug.cfs_rq:/.util_avg.stddev
    584356 ± 11%     -16.3%     489033 ±  4%  sched_debug.cpu.nr_switches.max
    105882 ± 12%     -19.3%      85494 ±  9%  sched_debug.cpu.nr_switches.stddev
 1.504e+12 ±  0%      +5.2%  1.582e+12 ±  0%  perf-stat.branch-instructions
      0.23 ±  0%     -27.1%       0.17 ±  0%  perf-stat.branch-miss-rate%
 3.429e+09 ±  0%     -23.3%  2.629e+09 ±  0%  perf-stat.branch-misses
  2.01e+11 ±  0%      -5.3%  1.903e+11 ±  0%  perf-stat.cache-misses
  2.01e+11 ±  0%      -5.3%  1.903e+11 ±  0%  perf-stat.cache-references
   9958921 ±  0%      -3.3%    9628936 ±  1%  perf-stat.context-switches
 3.121e+12 ±  0%      +1.5%  3.167e+12 ±  0%  perf-stat.dTLB-loads
      0.01 ± 13%     -11.7%       0.01 ±  0%  perf-stat.dTLB-store-miss-rate%
 1.828e+08 ± 13%     -11.0%  1.627e+08 ±  0%  perf-stat.dTLB-store-misses
  2.36e+08 ±  2%     -49.0%  1.204e+08 ±  5%  perf-stat.iTLB-load-misses
 1.093e+08 ± 12%     -33.6%   72525628 ± 28%  perf-stat.iTLB-loads
 8.491e+12 ±  0%      +3.5%   8.79e+12 ±  0%  perf-stat.instructions
     36006 ±  2%    +103.4%      73247 ±  6%  perf-stat.instructions-per-iTLB-miss
      0.76 ±  0%      +3.3%       0.78 ±  0%  perf-stat.ipc
      5.73 ±  0%    -100.0%       0.00 ± -1%  perf-profile.calltrace.cycles-pp.__generic_file_splice_read.generic_file_splice_read.do_splice_to.splice_direct_to_actor.do_splice_direct
      0.00 ± -1%      +Inf%       0.97 ±  3%  perf-profile.calltrace.cycles-pp.__radix_tree_lookup.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.generic_file_read_iter
      0.00 ± -1%      +Inf%       2.81 ±  1%  perf-profile.calltrace.cycles-pp.copy_page_to_iter.generic_file_read_iter.generic_file_splice_read.do_splice_to.splice_direct_to_actor
      8.45 ±  0%     +58.1%      13.36 ±  0%  perf-profile.calltrace.cycles-pp.do_splice_to.splice_direct_to_actor.do_splice_direct.do_sendfile.sys_sendfile64
      0.00 ± -1%      +Inf%       2.86 ±  1%  perf-profile.calltrace.cycles-pp.find_get_entry.pagecache_get_page.generic_file_read_iter.generic_file_splice_read.do_splice_to
      2.89 ±  2%    -100.0%       0.00 ± -1%  perf-profile.calltrace.cycles-pp.find_get_pages_contig.__generic_file_splice_read.generic_file_splice_read.do_splice_to.splice_direct_to_actor
      0.00 ± -1%      +Inf%      10.29 ±  0%  perf-profile.calltrace.cycles-pp.generic_file_read_iter.generic_file_splice_read.do_splice_to.splice_direct_to_actor.do_splice_direct
      6.65 ±  1%     +75.5%      11.68 ±  0%  perf-profile.calltrace.cycles-pp.generic_file_splice_read.do_splice_to.splice_direct_to_actor.do_splice_direct.do_sendfile
      0.00 ± -1%      +Inf%       3.47 ±  1%  perf-profile.calltrace.cycles-pp.pagecache_get_page.generic_file_read_iter.generic_file_splice_read.do_splice_to.splice_direct_to_actor
      0.00 ± -1%      +Inf%       1.15 ±  3%  perf-profile.calltrace.cycles-pp.radix_tree_lookup_slot.find_get_entry.pagecache_get_page.generic_file_read_iter.generic_file_splice_read
      2.88 ±  1%     +16.9%       3.37 ±  2%  perf-profile.children.cycles-pp.___might_sleep
      5.77 ±  0%    -100.0%       0.00 ± -1%  perf-profile.children.cycles-pp.__generic_file_splice_read
      0.00 ± -1%      +Inf%       1.07 ±  3%  perf-profile.children.cycles-pp.__radix_tree_lookup
      0.98 ±  2%     +28.6%       1.26 ±  2%  perf-profile.children.cycles-pp.atime_needs_update
      8.49 ±  0%     +57.9%      13.40 ±  0%  perf-profile.children.cycles-pp.do_splice_to
      0.00 ± -1%      +Inf%       3.00 ±  1%  perf-profile.children.cycles-pp.find_get_entry
      2.92 ±  2%    -100.0%       0.00 ± -1%  perf-profile.children.cycles-pp.find_get_pages_contig
      0.00 ± -1%      +Inf%      10.30 ±  0%  perf-profile.children.cycles-pp.generic_file_read_iter
      6.67 ±  1%     +75.9%      11.74 ±  0%  perf-profile.children.cycles-pp.generic_file_splice_read
      0.00 ± -1%      +Inf%       3.55 ±  1%  perf-profile.children.cycles-pp.pagecache_get_page
      0.00 ± -1%      +Inf%       1.28 ±  3%  perf-profile.children.cycles-pp.radix_tree_lookup_slot
      0.98 ±  3%     +38.1%       1.36 ±  2%  perf-profile.children.cycles-pp.touch_atime
      2.88 ±  1%     +16.9%       3.37 ±  2%  perf-profile.self.cycles-pp.___might_sleep
      1.94 ±  1%    -100.0%       0.00 ± -1%  perf-profile.self.cycles-pp.__generic_file_splice_read
      0.00 ± -1%      +Inf%       1.07 ±  3%  perf-profile.self.cycles-pp.__radix_tree_lookup
      0.53 ±  3%    +425.7%       2.76 ±  1%  perf-profile.self.cycles-pp.copy_page_to_iter
      0.00 ± -1%      +Inf%       1.75 ±  2%  perf-profile.self.cycles-pp.find_get_entry
      2.13 ±  2%    -100.0%       0.00 ± -1%  perf-profile.self.cycles-pp.find_get_pages_contig
      0.00 ± -1%      +Inf%       2.26 ±  0%  perf-profile.self.cycles-pp.generic_file_read_iter
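
As a reading aid for the table above, here is a minimal sketch of how the %change column and one derived counter (branch-miss-rate%) relate to the raw values. The function names are illustrative and are not part of lkp-tests. The inputs are the rounded figures printed in the table, so a recomputed value may differ from the printed one in the last digit.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change from the parent commit (base) to the tested commit (new)."""
    return (new - base) / base * 100.0


def branch_miss_rate(misses: float, branches: float) -> float:
    """Branch misses as a percentage of branch instructions."""
    return misses / branches * 100.0


# (parent 241699cd72, tested 82c156f853), copied from the table above.
rows = {
    "netperf.Throughput_Mbps": (11696, 11194),
    "perf-stat.branch-misses": (3.429e09, 2.629e09),
    "perf-stat.iTLB-load-misses": (2.36e08, 1.204e08),
}

for name, (base, new) in rows.items():
    print(f"{percent_change(base, new):+7.1f}%  {name}")

# Derived counter: branch-miss-rate% for each commit.
print(f"{branch_miss_rate(3.429e09, 1.504e12):.2f}")  # parent commit
print(f"{branch_miss_rate(2.629e09, 1.582e12):.2f}")  # tested commit
```

Rows shown as -100.0% or +Inf% in the table are functions that appear in only one of the two profiles, so a relative change is not meaningful for them.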



                             perf-stat.branch-instructions

  1.6e+12 O+OO---OO----O-O----OO-O-OO-O-OO-O-OO-O-OO-O-OO-O-OO-OO-O-OO-O-OO-O
          *  *.*.**.*.**.*.*  **.*.**   **.*.**.*.**.*.**.*.**.**.*.**.*    |
  1.4e+12 ++ :             :  :     :   :                                   |
  1.2e+12 ++ :             :  :     :   :                                   |
          |  :             :  :     :   :                                   |
    1e+12 ++ :             : :       : :                                    |
          |: :             : :       : :                                    |
    8e+11 ++ :             : :       : :                                    |
          |::               ::       : :                                    |
    6e+11 ++:               ::       : :                                    |
    4e+11 ++:               ::       : :                                    |
          | :               :         :                                     |
    2e+11 ++:               :         :                                     |
          | :               :         :                                     |
        0 ++*--O----O-O----OO---------*-------------------------------------+


                                   perf-stat.ipc

  0.8 O+O-O--O-O----O-O----O-O-OO-O-O-OO-O-O-OO-O-O-OO-O-O-OO-O-O-OO-O-O-OO-O
      *   **.*.*.**.*.*.*  *.*.**.*   **.*.*.**.*.*.**.*.*.**.*.*.**.*.*    |
  0.7 ++  :             :  :      :   :                                     |
  0.6 ++  :             :  :      :   :                                     |
      |   :             :  :      :   :                                     |
  0.5 ++ :              : :        : :                                      |
      |: :              : :        : :                                      |
  0.4 ++ :              : :        : :                                      |
      |: :               ::        : :                                      |
  0.3 ++ :               ::        : :                                      |
  0.2 ++ :               ::        : :                                      |
      | :                :          :                                       |
  0.1 ++:                :          :                                       |
      | :                :          :                                       |
    0 ++*--O-----OO-----OO----------*---------------------------------------+


                               netperf.Throughput_Mbps

  12000 *+**-*-*-**-*-**-*-**-*-*-**-*-**-*-**-*-*-**-*----*-**-*-*-**-*----+
        O OO   O O    OO    O O O OO O OO O OO O O OO O OO O OO O O OO O OO O
  10000 ++                                                                  |
        |                                                                   |
        |                                                                   |
   8000 ++                                                                  |
        |                                                                   |
   6000 ++                                                                  |
        |                                                                   |
   4000 ++                                                                  |
        |                                                                   |
        |                                                                   |
   2000 ++                                                                  |
        |                                                                   |
      0 ++---O----O-O----O-O------------------------------------------------+


  
	[*] bisect-good sample
	[O] bisect-bad  sample





Thanks,
Xiaolong

View attachment "config-4.8.0-rc8-00009-g82c156f" of type "text/plain" (152576 bytes)

View attachment "job-script" of type "text/plain" (6922 bytes)

View attachment "job.yaml" of type "text/plain" (4429 bytes)

View attachment "reproduce" of type "text/plain" (1913 bytes)
