Date:   Fri, 30 Mar 2018 09:27:21 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:     Stephen Rothwell <sfr@...b.auug.org.au>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Tony Lindgren <tony@...mide.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Johannes Weiner <hannes@...xchg.org>,
        Laura Abbott <lauraa@...eaurora.org>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...e.com>,
        Michal Nazarewicz <mina86@...a86.com>,
        Minchan Kim <minchan@...nel.org>,
        Rik van Riel <riel@...hat.com>,
        Russell King <linux@...linux.org.uk>,
        Will Deacon <will.deacon@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp-robot] [mm/cma]  4405c5fd84:  vm-scalability.throughput +26.5%
 improvement


Greetings,

FYI, we noticed a +26.5% improvement in vm-scalability.throughput due to commit:


commit: 4405c5fd8434809972dd2996c4dbfe5124b01d55 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: vm-scalability
on test machine: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
with the following parameters:

	runtime: 300
	thp_enabled: never
	thp_defrag: always
	nr_task: 8
	nr_pmem: 4
	priority: 1
	test: swap-w-seq-mt
	cpufreq_governor: performance

test-description: The motivation behind this suite is to exercise the functions and regions of the Linux kernel's mm/ subsystem that are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_pmem/nr_task/priority/rootfs/runtime/tbox_group/test/testcase/thp_defrag/thp_enabled:
  gcc-7/performance/x86_64-rhel-7.2/4/8/1/debian-x86_64-2016-08-31.cgz/300/lkp-hsw-ep2/swap-w-seq-mt/vm-scalability/always/never
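
The slash-separated header line and value line above pair up field by field. A minimal Python sketch of that pairing, assuming both lines are copied verbatim from this report (the helper name pair_fields is hypothetical and is not part of lkp-tests):

```python
# Header and value lines copied from the report above.
HEADER = ("compiler/cpufreq_governor/kconfig/nr_pmem/nr_task/priority/rootfs/"
          "runtime/tbox_group/test/testcase/thp_defrag/thp_enabled:")
VALUES = ("gcc-7/performance/x86_64-rhel-7.2/4/8/1/debian-x86_64-2016-08-31.cgz/"
          "300/lkp-hsw-ep2/swap-w-seq-mt/vm-scalability/always/never")


def pair_fields(header, values):
    """Map each slash-separated field name to its value (hypothetical helper)."""
    keys = header.strip().rstrip(":").split("/")
    vals = values.strip().split("/")
    if len(keys) != len(vals):
        raise ValueError(f"{len(keys)} field names but {len(vals)} values")
    return dict(zip(keys, vals))


params = pair_fields(HEADER, VALUES)
for key, value in params.items():
    print(f"{key:>18}: {value}")
```

The resulting mapping should agree with the parameter list near the top of the report, for example thp_enabled and nr_task.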

commit: 
  41fd9c44d9 ("mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request")
  4405c5fd84 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")

41fd9c44d94101ec 4405c5fd8434809972dd2996c4 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   3278698           +26.5%    4148766 ±  2%  vm-scalability.throughput
      0.76 ±  2%      -7.9%       0.70 ±  2%  vm-scalability.free_time
    414083           +28.8%     533402 ±  2%  vm-scalability.median
     37.39           -11.2%      33.19        vm-scalability.time.elapsed_time
     37.39           -11.2%      33.19        vm-scalability.time.elapsed_time.max
    646.75           -10.1%     581.25        vm-scalability.time.percent_of_cpu_this_job_got
    182.59           -24.6%     137.76 ±  2%  vm-scalability.time.system_time
     59.42            -6.8%      55.40        vm-scalability.time.user_time
      8942 ±  9%    +105.6%      18384 ±  6%  vm-scalability.time.voluntary_context_switches
   2089321 ±  5%     -26.6%    1532791 ±  3%  cpuidle.C1.time
  62485028           -32.1%   42426989 ±  4%  interrupts.CAL:Function_call_interrupts
      9.87            -2.0        7.89        mpstat.cpu.sys%
    859.75 ± 14%     +54.6%       1329 ± 34%  slabinfo.dmaengine-unmap-16.active_objs
    859.75 ± 14%     +54.6%       1329 ± 34%  slabinfo.dmaengine-unmap-16.num_objs
   2691121 ±  9%     -31.9%    1832052 ±  8%  softirqs.RCU
    565376           -10.6%     505518 ±  2%  softirqs.TIMER
   1798289 ± 20%    +219.4%    5744223 ±  6%  numa-numastat.node0.numa_foreign
  11379510 ±  6%     -30.9%    7864229 ±  5%  numa-numastat.node1.local_node
  11380678 ±  6%     -30.9%    7867707 ±  5%  numa-numastat.node1.numa_hit
   1798289 ± 20%    +219.4%    5744223 ±  6%  numa-numastat.node1.numa_miss
   1799458 ± 20%    +219.4%    5747704 ±  6%  numa-numastat.node1.other_node
 7.148e+08 ±  4%     -35.5%  4.607e+08 ±  6%  perf-node.node-load-misses
 2.806e+08 ±  2%     -25.9%  2.079e+08 ±  7%  perf-node.node-loads
     27.75 ±  5%      +9.9%      30.50 ±  3%  perf-node.node-local-load-ratio
 2.166e+08 ±  3%     -27.1%  1.579e+08 ±  8%  perf-node.node-store-misses
 2.734e+08 ±  9%     -43.9%  1.534e+08 ± 10%  perf-node.node-stores
    447.75 ±  2%     -10.8%     399.25 ±  2%  turbostat.Avg_MHz
     15.16 ±  2%      -1.8       13.36        turbostat.Busy%
      0.07 ±  5%      -0.0        0.06 ±  7%  turbostat.C1%
 1.875e+08           -33.8%  1.242e+08 ±  4%  turbostat.IRQ
    152.01            -2.7%     147.85        turbostat.PkgWatt
     14.18            -2.3%      13.85        turbostat.RAMWatt
    586.75 ±  7%     +13.5%     666.25 ±  3%  vmstat.memory.buff
   5757201 ±  3%     +22.8%    7068800 ±  4%  vmstat.memory.free
  27365780           -39.1%   16652327 ±  2%  vmstat.memory.swpd
      9.00           -22.2%       7.00        vmstat.procs.r
    829.00 ± 13%     -38.6%     509.00 ±  5%  vmstat.swap.si
   1658990           -26.7%    1215747        vmstat.swap.so
   1636758           -23.0%    1260916        vmstat.system.in
     47862 ±  6%     -65.1%      16689 ±  6%  meminfo.CmaFree
  83417008            -9.9%   75152314        meminfo.Committed_AS
   1850235           -19.2%    1494102 ±  4%  meminfo.Inactive
   1849617           -19.3%    1493422 ±  4%  meminfo.Inactive(anon)
   5362375 ±  3%     +22.4%    6564299        meminfo.MemAvailable
   5534597 ±  2%     +22.3%    6766142        meminfo.MemFree
    103923           -21.3%      81789        meminfo.PageTables
  73933617           +13.6%   83992710        meminfo.SwapFree
  11799653            -8.0%   10850808 ±  3%  numa-meminfo.node0.AnonPages
    937028           -19.4%     755072 ±  6%  numa-meminfo.node0.Inactive
    936810           -19.4%     754801 ±  6%  numa-meminfo.node0.Inactive(anon)
   2741438 ±  8%     +35.6%    3716602 ±  9%  numa-meminfo.node0.MemFree
     59023           -51.4%      28701 ±  9%  numa-meminfo.node0.PageTables
    921828 ±  2%     -20.4%     733395 ±  5%  numa-meminfo.node1.Inactive
    921425 ±  2%     -20.5%     732971 ±  5%  numa-meminfo.node1.Inactive(anon)
     45693           +17.0%      53445 ±  5%  numa-meminfo.node1.PageTables
      8923 ±  4%     +66.6%      14863 ± 25%  sched_debug.cfs_rq:/.min_vruntime.avg
     16889 ± 10%     +36.7%      23084 ± 14%  sched_debug.cfs_rq:/.min_vruntime.max
      2110 ±  9%     +22.4%       2584 ± 11%  sched_debug.cfs_rq:/.min_vruntime.stddev
      2287 ± 30%    +205.4%       6985 ± 10%  sched_debug.cfs_rq:/.spread0.avg
     10329 ± 13%     +47.8%      15262 ±  5%  sched_debug.cfs_rq:/.spread0.max
      2119 ±  8%     +21.2%       2569 ± 10%  sched_debug.cfs_rq:/.spread0.stddev
      1223 ± 17%     +25.1%       1531 ± 21%  sched_debug.cfs_rq:/.util_avg.max
      3392 ± 36%    +534.3%      21519 ± 58%  sched_debug.cpu.avg_idle.min
   2929484 ±  2%      -8.5%    2679073 ±  3%  numa-vmstat.node0.nr_anon_pages
    705776 ±  9%     +36.5%     963258 ±  6%  numa-vmstat.node0.nr_free_pages
    232015           -20.0%     185568 ±  7%  numa-vmstat.node0.nr_inactive_anon
     11856 ±  6%     +23.8%      14680 ±  9%  numa-vmstat.node0.nr_indirectly_reclaimable
     14587           -51.7%       7041 ±  7%  numa-vmstat.node0.nr_page_table_pages
   3831258 ±  3%     -25.8%    2842415 ± 11%  numa-vmstat.node0.nr_vmscan_write
   3831285 ±  3%     -25.8%    2842319 ± 11%  numa-vmstat.node0.nr_written
    232105           -20.0%     185668 ±  7%  numa-vmstat.node0.nr_zone_inactive_anon
    942274 ± 14%    +268.2%    3469716 ±  7%  numa-vmstat.node0.numa_foreign
     11986 ±  5%     -64.6%       4238 ±  8%  numa-vmstat.node1.nr_free_cma
    228092 ±  2%     -21.5%     179096 ±  6%  numa-vmstat.node1.nr_inactive_anon
     11051 ±  6%     -25.6%       8224 ± 17%  numa-vmstat.node1.nr_indirectly_reclaimable
      2930 ±  9%     -11.1%       2604 ± 10%  numa-vmstat.node1.nr_mapped
     11307 ±  2%     +17.1%      13236 ±  6%  numa-vmstat.node1.nr_page_table_pages
   4507408 ±  3%     -19.7%    3619831 ± 10%  numa-vmstat.node1.nr_vmscan_write
   4507421 ±  3%     -19.7%    3619853 ± 10%  numa-vmstat.node1.nr_written
    228157 ±  2%     -21.5%     179161 ±  6%  numa-vmstat.node1.nr_zone_inactive_anon
   7544972 ±  3%     -36.3%    4806885 ±  9%  numa-vmstat.node1.numa_hit
   7381155 ±  3%     -37.1%    4641005 ±  9%  numa-vmstat.node1.numa_local
    943340 ± 14%    +268.0%    3471701 ±  7%  numa-vmstat.node1.numa_miss
   1107163 ± 12%    +228.6%    3637587 ±  7%  numa-vmstat.node1.numa_other
      0.77 ±  2%      -0.2        0.58 ±  4%  perf-stat.branch-miss-rate%
 1.461e+09           -23.3%  1.121e+09 ±  6%  perf-stat.branch-misses
     25.39            -1.6       23.82 ±  2%  perf-stat.cache-miss-rate%
 1.492e+09 ±  2%     -26.0%  1.104e+09 ±  7%  perf-stat.cache-misses
 5.877e+09           -21.2%  4.634e+09 ±  6%  perf-stat.cache-references
      1.63           -10.3%       1.47        perf-stat.cpi
 1.223e+12 ±  2%     -12.0%  1.075e+12 ±  7%  perf-stat.cpu-cycles
      1.01 ±  2%      -0.2        0.85 ±  6%  perf-stat.dTLB-load-miss-rate%
 1.873e+09 ± 10%     -24.1%  1.423e+09 ±  6%  perf-stat.dTLB-load-misses
 1.038e+11 ±  2%     -13.8%   8.95e+10 ±  7%  perf-stat.dTLB-stores
 5.226e+08 ±  5%     -21.4%  4.108e+08        perf-stat.iTLB-load-misses
  90580077 ±  5%     -13.8%   78079215 ±  4%  perf-stat.iTLB-loads
      1434 ±  5%     +24.6%       1787 ±  8%  perf-stat.instructions-per-iTLB-miss
      0.61           +11.5%       0.68        perf-stat.ipc
      5290 ±  3%     -17.7%       4356 ±  5%  perf-stat.major-faults
     71.12            -2.3       68.81        perf-stat.node-load-miss-rate%
 7.031e+08 ±  4%     -33.4%  4.681e+08 ±  6%  perf-stat.node-load-misses
 2.853e+08 ±  3%     -25.6%  2.122e+08 ±  7%  perf-stat.node-loads
     45.16 ±  6%      +4.0       49.16 ±  3%  perf-stat.node-store-miss-rate%
 2.153e+08 ±  3%     -29.0%  1.529e+08        perf-stat.node-store-misses
 2.634e+08 ± 11%     -39.9%  1.584e+08 ±  6%  perf-stat.node-stores
    119681 ±  3%     -27.8%      86457 ±  5%  proc-vmstat.allocstall_movable
     40547 ± 14%     -53.6%      18814 ± 31%  proc-vmstat.allocstall_normal
    629.25 ± 93%     -99.7%       2.00 ± 61%  proc-vmstat.compact_stall
     18.75 ± 31%    +212.0%      58.50 ± 89%  proc-vmstat.kswapd_high_wmark_hit_quickly
    129618 ±  3%     +26.4%     163840 ±  3%  proc-vmstat.nr_dirty_background_threshold
    259555 ±  3%     +26.4%     328082 ±  3%  proc-vmstat.nr_dirty_threshold
     11625 ±  5%     -64.0%       4188 ±  8%  proc-vmstat.nr_free_cma
   1344481 ±  3%     +26.2%    1696756 ±  3%  proc-vmstat.nr_free_pages
    465487           -19.8%     373471 ±  3%  proc-vmstat.nr_inactive_anon
    189.25 ±  5%     -24.7%     142.50 ±  9%  proc-vmstat.nr_isolated_anon
     26266           -22.0%      20495        proc-vmstat.nr_page_table_pages
   8458281 ±  2%     -23.7%    6456268 ±  4%  proc-vmstat.nr_vmscan_write
  16689767           -34.4%   10944327 ±  5%  proc-vmstat.nr_written
    465637           -19.8%     373643 ±  3%  proc-vmstat.nr_zone_inactive_anon
   4610783 ±  2%     +72.2%    7938811        proc-vmstat.numa_foreign
  20568554           -17.2%   17030926        proc-vmstat.numa_hit
  20554622           -17.2%   17017000        proc-vmstat.numa_local
   4610783 ±  2%     +72.2%    7938811        proc-vmstat.numa_miss
   4624715 ±  2%     +72.0%    7952737        proc-vmstat.numa_other
  11566637 ±  2%     -14.0%    9944302 ±  5%  proc-vmstat.numa_pte_updates
   1439116 ±  2%     -24.9%    1080676 ± 11%  proc-vmstat.pgalloc_dma32
  17284194           -33.1%   11570103 ±  5%  proc-vmstat.pgdeactivate
      9.00 ± 19%  +15077.8%       1366 ± 77%  proc-vmstat.pgmigrate_success
  17280106           -32.9%   11600628 ±  5%  proc-vmstat.pgrefill
  12136735 ± 12%     -38.3%    7491459 ± 14%  proc-vmstat.pgscan_direct
   6262493           -30.7%    4340098 ±  5%  proc-vmstat.pgscan_kswapd
  10431149 ±  2%     -36.6%    6608259 ±  5%  proc-vmstat.pgsteal_direct
   6260464           -30.7%    4337402 ±  5%  proc-vmstat.pgsteal_kswapd
      6045 ± 17%     -47.4%       3179 ± 23%  proc-vmstat.pswpin
  16693811           -34.4%   10948374 ±  5%  proc-vmstat.pswpout
     33044 ±  6%     -20.3%      26344 ±  4%  proc-vmstat.slabs_scanned
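
As a sanity check on how the columns relate, the sketch below recomputes a few %change values from the base and patched means quoted in the table. It assumes that rows whose metric name ends in "%" report an absolute difference in percentage points rather than a relative change; that matches the numbers shown but is an inference, not something the report states. The function names are hypothetical.

```python
def pct_change(base, patched):
    """Relative change of the patched mean versus the base mean, in percent."""
    return (patched - base) / base * 100.0


def point_change(base, patched):
    """Absolute difference, for metrics that are already percentages (assumed)."""
    return patched - base


# (metric, mean for 41fd9c44d9, mean for 4405c5fd84), values taken from the table.
relative_rows = [
    ("vm-scalability.throughput", 3278698, 4148766),
    ("vm-scalability.median", 414083, 533402),
    ("vm-scalability.time.elapsed_time", 37.39, 33.19),
]
point_rows = [
    ("mpstat.cpu.sys%", 9.87, 7.89),
    ("perf-stat.cache-miss-rate%", 25.39, 23.82),
]

relative = {name: round(pct_change(b, p), 1) for name, b, p in relative_rows}
points = {name: round(point_change(b, p), 1) for name, b, p in point_rows}

for name, value in relative.items():
    print(f"{value:+7.1f}%  {name}")
for name, value in points.items():
    print(f"{value:+7.1f}   {name}")
```

The rounded results can be compared directly against the %change column for the same rows.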
     


                               vm-scalability.throughput                        
                                                                                
  4.4e+06 +-+---------------------------------------------------------------+   
          |     O                                                           |   
  4.2e+06 +-O O   O O   O    O     O O O O O                                |   
          O           O          O           O                              |   
          |                                                                 |   
    4e+06 +-+                                  O                            |   
          |                O   O                                            |   
  3.8e+06 +-+                                                               |   
          |                                                                 |   
  3.6e+06 +-+                                                               |   
          |                                                                 |   
          |                     .+                                          |   
  3.4e+06 +-+                 .+  +                 .+.+.+.. .+.+.+.        |   
          |  + .+.+.+.+.+..+.+     +.+.+.+.   .+. .+        +       +.+. .+.|   
  3.2e+06 +-+---------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.16.0-rc6-00268-g4405c5f" of type "text/plain" (165965 bytes)

View attachment "job-script" of type "text/plain" (7759 bytes)

View attachment "job.yaml" of type "text/plain" (5315 bytes)

View attachment "reproduce" of type "text/plain" (1085 bytes)
