Message-ID: <20180418010753.GA20825@yexl-desktop>
Date:   Wed, 18 Apr 2018 09:07:53 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:     Stephen Rothwell <sfr@...b.auug.org.au>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Tony Lindgren <tony@...mide.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Johannes Weiner <hannes@...xchg.org>,
        Laura Abbott <lauraa@...eaurora.org>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...e.com>,
        Michal Nazarewicz <mina86@...a86.com>,
        Minchan Kim <minchan@...nel.org>,
        Rik van Riel <riel@...hat.com>,
        Russell King <linux@...linux.org.uk>,
        Will Deacon <will.deacon@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp-robot] [mm/cma] a57a290bd3: vm-scalability.throughput -15.5% regression


Greetings,

FYI, we noticed a -15.5% regression of vm-scalability.throughput due to commit:


commit: a57a290bd38f64bde9b8f797600aee3925109061 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: vm-scalability
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
with following parameters:

	runtime: 300s
	test: lru-file-readonce
	cpufreq_governor: performance

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ subsystem of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/300s/lkp-bdw-ep2/lru-file-readonce/vm-scalability

commit: 
  d92b1ec27c ("mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request")
  a57a290bd3 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")

d92b1ec27cae2c99 a57a290bd38f64bde9b8f79760 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    308089           -15.7%     259785        vm-scalability.median
      0.70 ± 56%    +696.1%       5.56 ±  4%  vm-scalability.stddev
  27142495           -15.5%   22927797        vm-scalability.throughput
    186.93           +23.9%     231.56        vm-scalability.time.elapsed_time
    186.93           +23.9%     231.56        vm-scalability.time.elapsed_time.max
    298206            +5.4%     314171        vm-scalability.time.involuntary_context_switches
      4022 ±  3%      -6.1%       3778        vm-scalability.time.maximum_resident_set_size
      7329            -4.5%       7003        vm-scalability.time.percent_of_cpu_this_job_got
     13376           +18.9%      15903        vm-scalability.time.system_time
    324.95            -3.2%     314.71        vm-scalability.time.user_time
      1345            +1.3%       1363        vm-scalability.time.voluntary_context_switches
  12905097           -21.5%   10131265 ±  3%  vmstat.memory.free
     16.12 ±  3%      +2.7       18.81        mpstat.cpu.idle%
      1.98            -0.4        1.56        mpstat.cpu.usr%
   7092821 ±  2%     -22.0%    5529796 ± 12%  numa-meminfo.node0.MemFree
   6466966 ±  2%     -21.4%    5082116 ± 11%  numa-meminfo.node1.MemFree
    467.06            -3.7%     449.96        pmeter.Average_Active_Power
     58116           -12.3%      50956        pmeter.performance_per_watt
      9546 ±  8%     +50.7%      14381 ± 27%  softirqs.NET_RX
    157856 ±  2%     +31.5%     207560 ±  3%  softirqs.SCHED
   5556773           +24.1%    6893563        softirqs.TIMER
    118309 ±  3%     +19.4%     141262        meminfo.Active
    117637 ±  3%     +19.6%     140648        meminfo.Active(anon)
     31240 ±  2%     -71.6%       8879 ± 25%  meminfo.CmaFree
  13541103           -22.3%   10519830 ±  3%  meminfo.MemFree
     97674 ±  4%     +25.1%     122193        meminfo.Shmem
     19884 ± 14%     +39.9%      27813 ± 12%  cpuidle.C1.usage
   4626065 ±  4%     +55.5%    7193168 ±  6%  cpuidle.C3.time
     16850 ±  6%     +45.7%      24551 ±  3%  cpuidle.C3.usage
 2.556e+09 ±  4%     +46.8%  3.752e+09        cpuidle.C6.time
   2612869 ±  4%     +47.0%    3841894        cpuidle.C6.usage
    286.50 ± 18%     +53.9%     441.00 ±  8%  cpuidle.POLL.usage
  12919583 ± 21%    +584.0%   88376315        numa-numastat.node0.numa_foreign
  13098152 ± 21%     +80.1%   23585358 ± 15%  numa-numastat.node0.numa_miss
  13110934 ± 21%     +80.0%   23593876 ± 15%  numa-numastat.node0.other_node
 5.278e+08           -17.0%  4.381e+08        numa-numastat.node1.local_node
  13098152 ± 21%     +80.1%   23585358 ± 15%  numa-numastat.node1.numa_foreign
 5.278e+08           -17.0%  4.381e+08        numa-numastat.node1.numa_hit
  12919583 ± 21%    +584.0%   88376315        numa-numastat.node1.numa_miss
  12923938 ± 21%    +583.9%   88384876        numa-numastat.node1.other_node
   1779873 ±  4%     -21.4%    1398188 ± 11%  numa-vmstat.node0.nr_free_pages
   6386636 ± 12%    +691.1%   50521587        numa-vmstat.node0.numa_foreign
      7869 ±  2%     -71.2%       2268 ± 25%  numa-vmstat.node1.nr_free_cma
   1613920           -20.6%    1281336 ± 11%  numa-vmstat.node1.nr_free_pages
    305.75 ±  7%     -15.3%     259.00 ±  3%  numa-vmstat.node1.nr_isolated_file
  3.11e+08           -17.7%   2.56e+08        numa-vmstat.node1.numa_hit
 3.108e+08           -17.7%  2.559e+08        numa-vmstat.node1.numa_local
   6388941 ± 12%    +691.0%   50534071        numa-vmstat.node1.numa_miss
   6564139 ± 12%    +672.6%   50713295        numa-vmstat.node1.numa_other
      2361            -3.4%       2281        turbostat.Avg_MHz
     18023 ± 13%     +43.9%      25927 ± 14%  turbostat.C1
     16372 ±  6%     +45.6%      23844 ±  5%  turbostat.C3
   2610248 ±  4%     +47.1%    3838846        turbostat.C6
     15.38 ±  2%      +2.9       18.30        turbostat.C6%
      4.11 ±  6%     +79.8%       7.38        turbostat.CPU%c1
  18087951 ±  2%     +22.2%   22101137        turbostat.IRQ
      5.18 ±  2%     -19.9%       4.15        turbostat.Pkg%pc2
    235.95            -3.4%     227.98        turbostat.PkgWatt
     24.09            -1.4%      23.76        turbostat.RAMWatt
   2839703 ±  7%     -31.7%    1939146 ± 13%  sched_debug.cfs_rq:/.min_vruntime.min
    397053 ±  4%     +25.8%     499507 ±  3%  sched_debug.cfs_rq:/.min_vruntime.stddev
      6.28 ±  2%     -16.4%       5.25 ±  7%  sched_debug.cfs_rq:/.nr_spread_over.avg
    192.50 ± 15%     -26.1%     142.25 ± 19%  sched_debug.cfs_rq:/.nr_spread_over.max
     27.64 ± 13%     -24.3%      20.92 ± 15%  sched_debug.cfs_rq:/.nr_spread_over.stddev
    -37835          +121.3%     -83735        sched_debug.cfs_rq:/.spread0.avg
  -2599772           +34.8%   -3503649        sched_debug.cfs_rq:/.spread0.min
    396913 ±  4%     +25.8%     499460 ±  3%  sched_debug.cfs_rq:/.spread0.stddev
     21.88 ± 27%    +114.6%      46.94 ±  6%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    647.08 ± 19%     +25.4%     811.67 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.max
    103.42 ± 23%     +52.1%     157.30        sched_debug.cfs_rq:/.util_est_enqueued.stddev
    126173 ±  4%     -15.6%     106542 ± 10%  sched_debug.cpu.nr_switches.max
      1975 ±  3%     -23.1%       1519        sched_debug.cpu.nr_switches.min
    122036 ±  4%     -18.0%     100120 ± 10%  sched_debug.cpu.sched_count.max
      1720 ±  2%     -29.7%       1210        sched_debug.cpu.sched_count.min
     60707 ±  5%     -17.4%      50149 ± 10%  sched_debug.cpu.ttwu_count.max
    830.08 ±  3%     -41.9%     482.17 ±  2%  sched_debug.cpu.ttwu_count.min
     60276 ±  5%     -17.7%      49636 ± 11%  sched_debug.cpu.ttwu_local.max
    785.33 ±  3%     -42.8%     449.00 ±  2%  sched_debug.cpu.ttwu_local.min
 2.823e+12           +15.8%  3.268e+12        perf-stat.branch-instructions
      0.64            -0.1        0.56        perf-stat.branch-miss-rate%
      7.68            +0.5        8.17        perf-stat.cache-miss-rate%
 1.475e+10            +6.7%  1.573e+10        perf-stat.cache-misses
   1689793 ±  3%     +21.3%    2049799 ±  3%  perf-stat.context-switches
      2.97            +5.0%       3.12        perf-stat.cpi
 3.878e+13           +19.4%  4.631e+13        perf-stat.cpu-cycles
     11366           +25.0%      14205        perf-stat.cpu-migrations
 3.505e+12           +13.0%  3.959e+12        perf-stat.dTLB-loads
 1.306e+13           +13.7%  1.485e+13        perf-stat.instructions
      4324 ±  6%     +17.0%       5062 ±  6%  perf-stat.instructions-per-iTLB-miss
      0.34            -4.8%       0.32        perf-stat.ipc
    529440           +18.3%     626390        perf-stat.minor-faults
     46.54 ±  3%     +15.3       61.85        perf-stat.node-load-miss-rate%
 8.175e+08 ±  6%     +68.9%  1.381e+09        perf-stat.node-load-misses
 9.372e+08            -9.1%  8.517e+08        perf-stat.node-loads
     13.14 ±  4%     +10.2       23.31        perf-stat.node-store-miss-rate%
 8.107e+08 ±  5%     +84.7%  1.497e+09        perf-stat.node-store-misses
 5.359e+09            -8.1%  4.925e+09        perf-stat.node-stores
    529444           +18.3%     626391        perf-stat.page-faults
      3040           +13.7%       3458        perf-stat.path-length
    126389           -13.8%     108944        proc-vmstat.allocstall_movable
      1015 ±  4%     +29.0%       1309 ±  5%  proc-vmstat.allocstall_normal
      1672 ±106%    +360.9%       7708 ± 49%  proc-vmstat.compact_migrate_scanned
      2597 ±  4%     -31.8%       1771 ±  6%  proc-vmstat.kswapd_low_wmark_hit_quickly
     29424 ±  3%     +19.9%      35279        proc-vmstat.nr_active_anon
      7830 ±  3%     -71.7%       2216 ± 26%  proc-vmstat.nr_free_cma
   3340670 ±  2%     -21.6%    2618440        proc-vmstat.nr_free_pages
    610.75 ±  2%     -10.1%     549.00        proc-vmstat.nr_isolated_file
     24391 ±  4%     +25.6%      30642        proc-vmstat.nr_shmem
     29425 ±  3%     +19.9%      35285        proc-vmstat.nr_zone_active_anon
  26017735 ± 21%    +330.3%   1.12e+08 ±  2%  proc-vmstat.numa_foreign
      1247 ± 13%     +26.3%       1575 ±  4%  proc-vmstat.numa_hint_faults
  26017735 ± 21%    +330.3%   1.12e+08 ±  2%  proc-vmstat.numa_miss
  26034889 ± 21%    +330.1%   1.12e+08 ±  2%  proc-vmstat.numa_other
      2602 ±  4%     -31.5%       1782 ±  7%  proc-vmstat.pageoutrun
    544184           +18.3%     643659        proc-vmstat.pgfault
 9.827e+08           -13.3%  8.519e+08        proc-vmstat.pgscan_direct
  59757667 ± 12%    +218.9%  1.906e+08 ±  2%  proc-vmstat.pgscan_kswapd
 9.827e+08           -13.3%  8.518e+08        proc-vmstat.pgsteal_direct
  59757604 ± 12%    +218.9%  1.906e+08 ±  2%  proc-vmstat.pgsteal_kswapd
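The %change column in the comparison table is the relative difference between the parent commit (d92b1ec27c) and the tested commit (a57a290bd3). As a quick sanity check of the headline numbers, the percent changes can be recomputed from the raw values in the table; this is a minimal sketch (values copied from the table above, not an lkp-tests tool):

```python
# Sanity-check the headline %change figures from the comparison table.
# Percent change is computed as (new - old) / old * 100, where "old" is
# the parent commit d92b1ec27c and "new" is the tested commit a57a290bd3.

def pct_change(old, new):
    return (new - old) / old * 100.0

# (old, new) pairs copied from the table above.
metrics = {
    "vm-scalability.throughput":       (27142495, 22927797),
    "vm-scalability.median":           (308089, 259785),
    "vm-scalability.time.elapsed_time": (186.93, 231.56),
}

for name, (old, new) in metrics.items():
    print(f"{name}: {pct_change(old, new):+.1f}%")
```

Running this reproduces the reported -15.5% throughput and -15.7% median drops, and the +23.9% increase in elapsed time.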


                                                                                
                               vm-scalability.throughput                        
                                                                                
  2.8e+07 +-+---------------------------------------------------------------+   
          |  +.+.      +   ++.++.+. +. + +   ++. .+  +.+ .+.+           .+  |   
  2.7e+07 +-+    + .+.+            +  +         +       +    +.      +.+  +.|   
          |       +                                            +     :      |   
          |                                                     +.+.+       |   
  2.6e+07 +-+                                                               |   
          |                                                                 |   
  2.5e+07 +-+                                                               |   
          |                                                                 |   
  2.4e+07 +-+                                                               |   
          |                                                                 |   
          O                   O         O                                   |   
  2.3e+07 +-OO O  O         O  O   O  O  O O                                |   
          |      O  O OO O O     O  O                                       |   
  2.2e+07 +-+---------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                vm-scalability.median                           
                                                                                
  320000 +-+----------------------------------------------------------------+   
         |.++.+        .+.++.+.++. .+  +.+.++.+  +.+.++                     |   
  310000 +-+   +  .+.++           +  :+        + :     + .++.        +. .++ |   
         |      ++                   +          +       +    +.+     : +   +|   
  300000 +-+                                                    +.+.+       |   
         |                                                                  |   
  290000 +-+                                                                |   
         |                                                                  |   
  280000 +-+                                                                |   
         |                                                                  |   
  270000 +-+                                                                |   
         |                             O                                    |   
  260000 O-OO                O       O   O                                  |   
         |    O OO O OO O OO   OO O O      O                                |   
  250000 +-+----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.16.0-11359-ga57a290" of type "text/plain" (164013 bytes)

View attachment "job-script" of type "text/plain" (7425 bytes)

View attachment "job.yaml" of type "text/plain" (5026 bytes)

View attachment "reproduce" of type "text/plain" (8821 bytes)
