Date:	Thu, 2 Jun 2016 14:45:07 +0800
From:	kernel test robot <xiaolong.ye@...el.com>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	David Rientjes <rientjes@...gle.com>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp] [mm] 795ae7a0de: pixz.throughput -9.1% regression



FYI, we noticed a -9.1% regression of pixz.throughput due to commit:

commit 795ae7a0de6b834a0cc202aa55c190ef81496665 ("mm: scale kswapd watermarks in proportion to memory")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

in testcase: pixz
on test machine: ivb43: 48 threads Ivytown Ivy Bridge-EP with 64G memory, with the following parameters: cpufreq_governor=performance/nr_threads=100%

In addition to that, the commit also has a significant impact on the following tests:

will-it-scale: will-it-scale.per_process_ops -8.5% regression on test machine ivb42: 48 threads Ivytown Ivy Bridge-EP with 64G memory,
with test parameters: cpufreq_governor=performance/test=page_fault1



Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/testcase:
  gcc-4.9/performance/x86_64-rhel/100%/debian-x86_64-2015-02-07.cgz/ivb43/pixz

commit: 
  3ed3a4f0ddffece942bb2661924d87be4ce63cb7
  795ae7a0de6b834a0cc202aa55c190ef81496665

3ed3a4f0ddffece9 795ae7a0de6b834a0cc202aa55
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  78505362 ±  0%      -9.1%   71324131 ±  0%  pixz.throughput
      4530 ±  0%      +1.0%       4575 ±  0%  pixz.time.percent_of_cpu_this_job_got
     14911 ±  0%      +2.3%      15251 ±  0%  pixz.time.user_time
   6586930 ±  0%      -7.5%    6093751 ±  1%  pixz.time.voluntary_context_switches
     49869 ±  1%      -9.0%      45401 ±  0%  vmstat.system.cs
     26406 ±  4%      -9.4%      23922 ±  5%  numa-meminfo.node0.SReclaimable
      4803 ± 85%     -87.0%     625.25 ± 16%  numa-meminfo.node1.Inactive(anon)
    946.75 ±  3%    +775.4%       8288 ±  1%  proc-vmstat.nr_alloc_batch
   2403080 ±  2%     -58.4%     999765 ±  0%  proc-vmstat.pgalloc_dma32
    651.75 ±  5%     +12.0%     730.25 ±  6%  sched_debug.cfs_rq:/.util_avg.min
     81.92 ± 23%     -31.2%      56.34 ± 11%  sched_debug.cfs_rq:/.util_avg.stddev
    -16.00 ±-13%     -23.2%     -12.29 ± -9%  sched_debug.cpu.nr_uninterruptible.min
    193807 ± 15%     -32.7%     130369 ± 10%  cpuidle.C1-IVT.usage
 8.866e+08 ±  2%     -15.4%  7.498e+08 ±  5%  cpuidle.C6-IVT.time
     93283 ±  0%     -13.2%      80986 ±  2%  cpuidle.C6-IVT.usage
    771388 ±  9%     -38.5%     474559 ±  6%  cpuidle.POLL.usage
    454.25 ±  2%    +772.9%       3965 ±  2%  numa-vmstat.node0.nr_alloc_batch
      6600 ±  4%      -9.4%       5980 ±  5%  numa-vmstat.node0.nr_slab_reclaimable
    563.00 ±  2%    +658.1%       4268 ±  2%  numa-vmstat.node1.nr_alloc_batch
      1201 ± 85%     -87.0%     156.00 ± 16%  numa-vmstat.node1.nr_inactive_anon
    792.00 ± 11%     +25.0%     990.00 ± 11%  slabinfo.blkdev_requests.active_objs
    792.00 ± 11%     +25.0%     990.00 ± 11%  slabinfo.blkdev_requests.num_objs
    507.00 ±  9%     +22.8%     622.75 ±  8%  slabinfo.file_lock_cache.active_objs
    507.00 ±  9%     +22.8%     622.75 ±  8%  slabinfo.file_lock_cache.num_objs
     94.35 ±  0%      +1.0%      95.28 ±  0%  turbostat.%Busy
      2824 ±  0%      +1.0%       2852 ±  0%  turbostat.Avg_MHz
      3.57 ±  3%     -21.1%       2.82 ±  1%  turbostat.CPU%c1
     22.36 ±  0%      -8.2%      20.52 ±  0%  turbostat.RAMWatt
      0.54 ± 17%     +89.8%       1.02 ± 25%  perf-profile.cycles-pp.do_huge_pmd_anonymous_page
      0.40 ± 65%    +140.3%       0.96 ± 31%  perf-profile.cycles-pp.mga_imageblit
      2.49 ± 40%     -63.2%       0.92 ± 94%  perf-profile.cycles-pp.poll_idle
      1.11 ± 61%     -95.7%       0.05 ±149%  perf-profile.cycles-pp.rest_init
      1.11 ± 61%     -95.7%       0.05 ±149%  perf-profile.cycles-pp.start_kernel
      1.11 ± 61%     -95.7%       0.05 ±149%  perf-profile.cycles-pp.x86_64_start_kernel
      1.11 ± 61%     -95.7%       0.05 ±149%  perf-profile.cycles-pp.x86_64_start_reservations
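
For readers checking the numbers, the %change column can be reproduced from the two mean values on each row. The sketch below assumes %change is the plain relative difference between the means of the two commits; the report does not state the formula, and the helper name percent_change is only illustrative.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent.

    Assumption: this is how the %change column is derived; the
    report itself does not spell out the formula.
    """
    return (new - base) / base * 100.0


# Mean pixz.throughput for the parent commit (3ed3a4f0ddff) and the
# bisected commit (795ae7a0de6b), copied from the table above.
base_throughput = 78505362
new_throughput = 71324131

change = percent_change(base_throughput, new_throughput)
print(f"pixz.throughput: {change:+.1f}%")  # prints "pixz.throughput: -9.1%"
```

The same calculation applied to will-it-scale.per_process_ops (442409 vs. 404670) gives the -8.5% quoted in the summary.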



                                    pixz.throughput

    8e+07 ++----------------------------------------------------------------+
  7.9e+07 ++  *. .*.               *. .*.      *. .*.                       |
          *. +  *   *.*.*..*.*.*. +  *   *.*. +  *   *.*.*..*.*.*.*         |
  7.8e+07 ++*                    *           *                              |
  7.7e+07 ++                                                                |
          |                                                                 |
  7.6e+07 ++                                                                |
  7.5e+07 ++                                                                |
  7.4e+07 ++                                                                |
          |                                                                 |
  7.3e+07 ++                                                                |
  7.2e+07 ++          O                                                     |
          O O O O O O   O  O O O O O O O O O O O O O O   O  O O O O O O O O |
  7.1e+07 ++                                           O                    O
    7e+07 ++----------------------------------------------------------------+


	[*] bisect-good sample
	[O] bisect-bad  sample


***************************************************************************************************
ivb42: 48 threads Ivytown Ivy Bridge-EP with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-4.9/performance/x86_64-rhel/debian-x86_64-2015-02-07.cgz/ivb42/page_fault1/will-it-scale

commit: 
  3ed3a4f0ddffece942bb2661924d87be4ce63cb7
  795ae7a0de6b834a0cc202aa55c190ef81496665

3ed3a4f0ddffece9 795ae7a0de6b834a0cc202aa55 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    442409 ±  0%      -8.5%     404670 ±  0%  will-it-scale.per_process_ops
    397397 ±  0%      -6.2%     372741 ±  0%  will-it-scale.per_thread_ops
      0.11 ±  1%     -15.1%       0.10 ±  0%  will-it-scale.scalability
      9933 ± 10%     +17.8%      11696 ±  4%  will-it-scale.time.involuntary_context_switches
   5158470 ±  3%      +5.4%    5438873 ±  0%  will-it-scale.time.maximum_resident_set_size
  10701739 ±  0%     -11.6%    9456315 ±  0%  will-it-scale.time.minor_page_faults
    825.00 ±  0%      +7.8%     889.75 ±  0%  will-it-scale.time.percent_of_cpu_this_job_got
      2484 ±  0%      +7.8%       2678 ±  0%  will-it-scale.time.system_time
     81.98 ±  0%      +8.7%      89.08 ±  0%  will-it-scale.time.user_time
    848972 ±  1%     -13.3%     735967 ±  0%  will-it-scale.time.voluntary_context_switches
  19395253 ±  0%     -20.0%   15511908 ±  0%  numa-numastat.node0.local_node
  19400671 ±  0%     -20.0%   15518877 ±  0%  numa-numastat.node0.numa_hit
      7954 ±  2%      -7.9%       7326 ±  4%  vmstat.system.cs
     21796 ±  0%      +3.2%      22492 ±  0%  vmstat.system.in
     15.92 ± 27%    +307.9%      64.92 ± 76%  sched_debug.cfs_rq:/.util_avg.min
    186.67 ± 37%     -47.7%      97.58 ± 20%  sched_debug.cpu.load.max
     39.14 ± 22%     -33.7%      25.95 ±  8%  sched_debug.cpu.load.stddev
      9933 ± 10%     +17.8%      11696 ±  4%  time.involuntary_context_switches
  10701739 ±  0%     -11.6%    9456315 ±  0%  time.minor_page_faults
    848972 ±  1%     -13.3%     735967 ±  0%  time.voluntary_context_switches
   4654910 ±  7%     -33.3%    3102519 ±  7%  cpuidle.C3-IVT.time
      9253 ± 10%     -38.9%       5650 ±  7%  cpuidle.C3-IVT.usage
   1302023 ±  2%     -15.8%    1096274 ±  1%  cpuidle.C6-IVT.usage
   8842237 ± 20%     -51.7%    4269878 ± 15%  cpuidle.POLL.time
    578.75 ±  6%    +640.7%       4286 ±  0%  numa-vmstat.node0.nr_alloc_batch
   9201698 ±  0%     -19.2%    7432378 ±  0%  numa-vmstat.node0.numa_hit
   9127517 ±  0%     -19.4%    7356249 ±  0%  numa-vmstat.node0.numa_local
    678.00 ±  3%    +569.4%       4538 ±  5%  numa-vmstat.node1.nr_alloc_batch
     42.41 ±  0%      +3.2%      43.75 ±  0%  turbostat.%Busy
      1270 ±  0%      +3.2%       1310 ±  0%  turbostat.Avg_MHz
      0.03 ±  0%     -33.3%       0.02 ±  0%  turbostat.CPU%c3
     15.59 ±  0%     -10.4%      13.97 ±  0%  turbostat.RAMWatt
      8.81 ±  5%     -41.9%       5.12 ± 54%  perf-profile.cycles-pp.call_cpuidle
      8.99 ±  5%     -41.4%       5.27 ± 52%  perf-profile.cycles-pp.cpu_startup_entry
      8.81 ±  5%     -42.0%       5.12 ± 54%  perf-profile.cycles-pp.cpuidle_enter
      8.18 ±  2%     -41.4%       4.80 ± 53%  perf-profile.cycles-pp.cpuidle_enter_state
      8.16 ±  2%     -41.3%       4.79 ± 53%  perf-profile.cycles-pp.intel_idle
      1169 ±  3%    +654.1%       8820 ±  1%  proc-vmstat.nr_alloc_batch
  28654816 ±  0%     -13.7%   24716699 ±  0%  proc-vmstat.numa_hit
  28645510 ±  0%     -13.7%   24707399 ±  0%  proc-vmstat.numa_local
 3.309e+08 ±  0%     -70.4%   97894895 ±  0%  proc-vmstat.pgalloc_dma32
  25869020 ±  0%     -13.6%   22338006 ±  0%  proc-vmstat.pgfault
 1.437e+09 ±  0%     -14.0%  1.236e+09 ±  0%  proc-vmstat.pgfree
   2755821 ±  0%     -14.0%    2369967 ±  0%  proc-vmstat.thp_deferred_split_page
   2757377 ±  0%     -14.0%    2371310 ±  0%  proc-vmstat.thp_fault_alloc
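
The comparison tables in this report share one row layout: base mean, base %stddev, %change, new mean, new %stddev, metric name. A minimal sketch of a parser for such rows follows, in case the numbers need to be filtered or sorted. The regular expression is inferred only from the rows shown in this email, and the names Row and parse_row are hypothetical; this is not part of lkp-tests.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    base: float            # mean on the parent commit
    base_stddev_pct: int   # %stddev on the parent commit
    change_pct: float      # %change column
    new: float             # mean on the bisected commit
    new_stddev_pct: int    # %stddev on the bisected commit
    metric: str            # e.g. "pixz.throughput"


# Layout inferred from the rows in this report (an assumption):
#   <base> ± <sd>%   <+/-change>%   <new> ± <sd>%   <metric>
ROW_RE = re.compile(
    r"^\s*(\S+)\s*±\s*(-?\d+)%"
    r"\s+([+-]?\d+(?:\.\d+)?)%"
    r"\s+(\S+)\s*±\s*(-?\d+)%"
    r"\s+(\S+)\s*$"
)


def parse_row(line: str) -> Optional[Row]:
    """Parse one table row; return None for headers and separators."""
    m = ROW_RE.match(line)
    if m is None:
        return None
    base, base_sd, change, new, new_sd, metric = m.groups()
    return Row(float(base), int(base_sd), float(change),
               float(new), int(new_sd), metric)


sample = "    442409 ±  0%      -8.5%     404670 ±  0%  will-it-scale.per_process_ops"
print(parse_row(sample))
```

Rows with large noise (for example the perf-profile entries with ±149%) parse the same way, so a caller could drop any row whose %stddev exceeds the size of its %change before drawing conclusions.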





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.5.0-02575-g795ae7a" of type "text/plain" (151562 bytes)

View attachment "job.yaml" of type "text/plain" (3554 bytes)

View attachment "reproduce" of type "text/plain" (4550 bytes)
