Date:	Mon, 1 Aug 2016 13:18:20 +0800
From:	kernel test robot <xiaolong.ye@...el.com>
To:	Mel Gorman <mgorman@...hsingularity.net>
Cc:	Stephen Rothwell <sfr@...b.auug.org.au>,
	Vlastimil Babka <vbabka@...e.cz>,
	Hillf Danton <hillf.zj@...baba-inc.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Michal Hocko <mhocko@...nel.org>,
	Minchan Kim <minchan@...nel.org>,
	Rik van Riel <riel@...riel.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp] [mm, page_alloc]  f85cb5028c: netperf.Throughput_Mbps -16.5%
 regression


FYI, we noticed a 16.5% regression in netperf.Throughput_Mbps due to commit:

commit f85cb5028c273692b2ac4f7f7e73f77ad2d5a3c2 ("mm, page_alloc: remove fair zone allocation policy")
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

in testcase: netperf
on test machine: 16 threads Broadwell-DE with 8G memory
with the following parameters:

	ip: ipv4
	runtime: 900s
	nr_threads: 25%
	cluster: cs-localhost
	test: TCP_STREAM
	cpufreq_governor: performance



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
  cs-localhost/gcc-6/performance/ipv4/x86_64-rhel/25%/debian-x86_64-2015-02-07.cgz/900s/lkp-bdw-de1/TCP_STREAM/netperf

commit: 
  23c17ad07f ("mm, vmscan: add classzone information to tracepoints")
  f85cb5028c ("mm, page_alloc: remove fair zone allocation policy")

23c17ad07fba9303 f85cb5028c273692b2ac4f7f7e 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           25%           1:4     kmsg.i2c_i2c-#:sendbytes:NAK_bailout
          1:4          -25%            :4     kmsg.usb#-#:can't_read_configurations,error
          1:4          -25%            :4     kmsg.usb#-#:unable_to_read_config_index#descriptor/all
         %stddev     %change         %stddev
             \          |                \  
     29019 ±  1%     -16.5%      24223 ±  2%  netperf.Throughput_Mbps
     12798 ±  6%     -17.2%      10593 ±  9%  netperf.time.involuntary_context_switches
    334.50 ±  1%     -13.4%     289.75 ±  1%  netperf.time.percent_of_cpu_this_job_got
      2941 ±  1%     -13.8%       2536 ±  1%  netperf.time.system_time
   1491248 ±  7%     +58.7%    2366463 ±  5%  netperf.time.voluntary_context_switches
      1896 ±  1%      +9.6%       2077 ±  0%  slabinfo.kmalloc-2048.active_objs
     16.91 ±  4%      +9.8%      18.57 ±  2%  perf-profile.cycles-pp.copy_user_enhanced_fast_string.tcp_sendmsg.inet_sendmsg.sock_sendmsg.SYSC_sendto
      0.83 ± 11%     +24.0%       1.03 ± 10%  perf-profile.cycles-pp.sk_stream_alloc_skb.tcp_sendmsg.inet_sendmsg.sock_sendmsg.SYSC_sendto
    238789 ± 11%     -76.8%      55338 ± 49%  vmstat.system.cs
     16823 ±  0%      -1.5%      16570 ±  0%  vmstat.system.in
 1.884e+08 ±  5%     -41.1%   1.11e+08 ± 10%  softirqs.NET_RX
    218456 ±  0%     +18.4%     258550 ±  1%  softirqs.RCU
   1105427 ±  1%     -11.8%     975484 ±  1%  softirqs.SCHED
    497245 ±  0%     -89.0%      54772 ±  1%  meminfo.Active
      8511 ±  0%    +543.5%      54765 ±  1%  meminfo.Active(anon)
    488733 ±  0%    -100.0%       6.00 ±  0%  meminfo.Active(file)
     54898 ±  1%    +790.2%     488733 ±  0%  meminfo.Inactive(file)
 2.098e+08 ± 12%     -79.0%   44033424 ± 52%  cpuidle.C1-BDW.time
 1.011e+08 ± 11%     -79.0%   21193683 ± 55%  cpuidle.C1-BDW.usage
 3.694e+08 ±  8%     +91.5%  7.074e+08 ±  5%  cpuidle.C3-BDW.time
   1104487 ±  7%     +81.0%    1999452 ±  5%  cpuidle.C3-BDW.usage
  83514559 ±  8%     -34.1%   55042921 ± 10%  cpuidle.POLL.time
   2178797 ± 11%     -79.1%     454769 ± 56%  cpuidle.POLL.usage
      2183 ±  2%    -100.0%       0.00 ± -1%  proc-vmstat.nr_alloc_batch
 3.936e+08 ±  1%     -15.7%  3.319e+08 ±  2%  proc-vmstat.numa_hit
 3.936e+08 ±  1%     -15.7%  3.319e+08 ±  2%  proc-vmstat.numa_local
 7.693e+08 ±  1%    -100.0%       0.00 ±  0%  proc-vmstat.pgalloc_dma32
 2.366e+09 ±  1%     +11.5%  2.639e+09 ±  2%  proc-vmstat.pgalloc_normal
 3.135e+09 ±  1%     -15.8%  2.639e+09 ±  2%  proc-vmstat.pgfree
     46.62 ±  0%      -3.6%      44.94 ±  0%  turbostat.%Busy
      1165 ±  0%      -3.6%       1124 ±  0%  turbostat.Avg_MHz
      1.09 ±  8%     +77.2%       1.93 ±  4%  turbostat.CPU%c3
     10.71 ±  2%     +43.7%      15.38 ±  5%  turbostat.CPU%c6
     39.17 ±  0%      -3.9%      37.66 ±  0%  turbostat.PkgWatt
     19.89 ±  4%     +20.1%      23.89 ±  2%  turbostat.RAMWatt
  1.06e+12 ±  4%     -33.3%  7.073e+11 ±  9%  perf-stat.branch-instructions
      0.40 ±  3%     +30.9%       0.52 ±  4%  perf-stat.branch-miss-rate
 4.228e+09 ±  1%     -13.0%   3.68e+09 ±  4%  perf-stat.branch-misses
 8.141e+11 ±  2%     -17.3%  6.732e+11 ±  4%  perf-stat.cache-misses
 8.141e+11 ±  2%     -17.3%  6.732e+11 ±  4%  perf-stat.cache-references
 2.156e+08 ± 11%     -76.8%   49969421 ± 49%  perf-stat.context-switches
     62562 ±  1%      -7.4%      57921 ±  0%  perf-stat.cpu-migrations
 1.651e+13 ±  1%      -3.2%  1.598e+13 ±  2%  perf-stat.cycles
      0.03 ±  4%     +36.5%       0.04 ±  6%  perf-stat.dTLB-load-miss-rate
 2.614e+12 ±  3%     -28.4%  1.871e+12 ±  5%  perf-stat.dTLB-loads
 9.526e+08 ± 25%     -37.7%  5.935e+08 ± 17%  perf-stat.dTLB-store-misses
 1.958e+12 ±  3%     -26.9%  1.432e+12 ±  4%  perf-stat.dTLB-stores
 1.169e+09 ±  9%     -52.6%  5.539e+08 ± 15%  perf-stat.iTLB-load-misses
 1.186e+13 ±  5%     -35.1%  7.702e+12 ±  9%  perf-stat.instructions
     10183 ±  3%     +37.8%      14031 ±  6%  perf-stat.instructions-per-iTLB-miss
      0.72 ±  6%     -33.0%       0.48 ±  8%  perf-stat.ipc
    246878 ± 25%     -77.2%      56283 ±141%  sched_debug.cfs_rq:/.MIN_vruntime.max
     63597 ± 28%     -74.7%      16086 ±146%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
      0.77 ± 23%     -36.7%       0.48 ± 32%  sched_debug.cfs_rq:/.load_avg.min
    246878 ± 25%     -77.2%      56283 ±141%  sched_debug.cfs_rq:/.max_vruntime.max
     63597 ± 28%     -74.7%      16086 ±146%  sched_debug.cfs_rq:/.max_vruntime.stddev
    322631 ±  2%     -13.0%     280763 ±  6%  sched_debug.cfs_rq:/.min_vruntime.stddev
    320721 ± 33%     +44.3%     462865 ± 21%  sched_debug.cfs_rq:/.spread0.max
    322633 ±  2%     -13.0%     280765 ±  6%  sched_debug.cfs_rq:/.spread0.stddev
    657860 ±  3%     +12.2%     738159 ±  1%  sched_debug.cpu.avg_idle.avg
     84895 ± 24%    +239.0%     287807 ± 13%  sched_debug.cpu.avg_idle.min
    357850 ±  4%     -20.4%     284765 ±  3%  sched_debug.cpu.avg_idle.stddev
     70.30 ±  3%     +12.2%      78.85 ±  1%  sched_debug.cpu.cpu_load[1].avg
     70.58 ±  3%     +11.9%      79.00 ±  1%  sched_debug.cpu.cpu_load[2].avg
     84932 ±  1%      -9.1%      77174 ±  3%  sched_debug.cpu.nr_load_updates.stddev
   7370152 ±  5%     -75.2%    1830136 ± 61%  sched_debug.cpu.nr_switches.avg
  14180734 ±  8%     -71.7%    4014680 ± 60%  sched_debug.cpu.nr_switches.max
   1787548 ±  7%     -81.2%     335812 ± 63%  sched_debug.cpu.nr_switches.min
   4217040 ±  9%     -73.3%    1124694 ± 63%  sched_debug.cpu.nr_switches.stddev
      0.32 ± 23%     -58.5%       0.13 ± 49%  sched_debug.rt_rq:/.rt_time.max
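
The %change column in the table above appears to be the relative difference
between the two per-commit means, (new - base) / base. The sketch below
recomputes it for a few rows as a sanity check. The helper name pct_change and
the choice of rows are illustrative assumptions and are not part of the lkp
tooling; the numbers are copied from the table.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed formula)."""
    return (new - base) / base * 100.0


# (metric, mean on 23c17ad07f, mean on f85cb5028c), copied from the table
rows = [
    ("netperf.Throughput_Mbps", 29019, 24223),
    ("netperf.time.voluntary_context_switches", 1491248, 2366463),
    ("vmstat.system.cs", 238789, 55338),
    ("meminfo.Active(anon)", 8511, 54765),
]

for metric, base, new in rows:
    # Print in the same "signed percent, metric name" order the report uses
    print(f"{pct_change(base, new):+8.1f}%  {metric}")
```

Rounded to one decimal place, these should agree with the reported column
if the assumed formula is the one the report uses.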





                               netperf.Throughput_Mbps

  35000 ++------------------------------------------------------------------+
        |           .*        .*                 .*            *.           |
  30000 ++ *      **  ***.****  ***.*  *       ** :  *       **  ****  * *.**
        |  :      :                 :  :       :  :  :       :      :  :*   |
  25000 OO OO O OO:O   OO    O    O O  :       :  :  :       :      :  :    |
        |  : O    O  OO   OOO  OOO  :  ::      :  :  ::      :       : :    |
  20000 ++::      :                 :  ::      :  :  ::      :       : :    |
        | ::      :                 :  ::      :  :  ::      :       : :    |
  15000 ++: :     :                  : ::      :   : ::      :       : :    |
        | : :    :                   :: :     :    :: :     :        ::     |
  10000 ++: :    :                   :: :     :    :: :     :        ::     |
        | : :    :                   ::  :    :    ::  :    :         :     |
   5000 ++  :    :                   ::  :    :    ::  :    :         :     |
        |:  :    :                   ::  :    :    ::  :    :         :     |
      0 **--***-**-------------------**--***-**----**--****-*---------*-----+



	[*] bisect-good sample
	[O] bisect-bad  sample





Thanks,
Xiaolong

View attachment "config-4.7.0-rc7-00317-gf85cb50" of type "text/plain" (151037 bytes)

View attachment "job.yaml" of type "text/plain" (4006 bytes)

View attachment "reproduce" of type "text/plain" (428 bytes)
