Message-ID: <20200901065019.GJ4299@shao2-debian>
Date:   Tue, 1 Sep 2020 14:50:19 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Joonsoo Kim <iamjoonsoo.kim@....com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Hugh Dickins <hughd@...gle.com>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...nel.org>,
        Minchan Kim <minchan@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...el.com
Subject: [mm/workingset] 170b04b7ae: vm-scalability.throughput 11.2%
 improvement

Greetings,

FYI, we noticed an 11.2% improvement in vm-scalability.throughput due to commit:


commit: 170b04b7ae49634df103810dad67b22cf8a99aa6 ("mm/workingset: prepare the workingset detection infrastructure for anon LRU")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: vm-scalability
on test machine: 104-thread Skylake with 192G of memory
with the following parameters:

	runtime: 300s
	size: 1T
	test: lru-shm
	cpufreq_governor: performance
	ucode: 0x2006906

test-description: The motivation behind this suite is to exercise functions and regions of the Linux kernel's mm/ subsystem which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
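
For orientation: the profile further down is dominated by the shmem fault
path (shmem_fault -> shmem_getpage_gfp -> lru_cache_add), so the lru-shm
case presumably mmap-faults shared memory. A rough, hypothetical shell
analogue (not the suite's own case script) that instead allocates shmem
through the tmpfs write path:

        # Hypothetical illustration only; the real case script is in the
        # vm-scalability repository above.
        mount -t tmpfs -o size=160G tmpfs /mnt/shm
        dd if=/dev/zero of=/mnt/shm/f bs=1M count=102400   # allocate ~100G of shmem pages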

Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml
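
Note that the job runs against whichever kernel is booted, so the tested
commit needs to be built and booted first. A minimal sketch, assuming a
standard kernel build with the attached config (paths are illustrative):

        git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
        cd linux
        git checkout 170b04b7ae49634df103810dad67b22cf8a99aa6
        cp ../config-5.8.0-12308-g170b04b7ae496 .config    # attached to this mail
        make olddefconfig && make -j"$(nproc)"
        sudo make modules_install install                  # reboot into it before "lkp run"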

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/1T/lkp-skl-fpga01/lru-shm/vm-scalability/0x2006906

commit: 
  b518154e59 ("mm/vmscan: protect the workingset on anonymous LRU")
  170b04b7ae ("mm/workingset: prepare the workingset detection infrastructure for anon LRU")

b518154e59aab3ad 170b04b7ae49634df103810dad6 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.05 ±  5%     -26.6%       0.03 ±  3%  vm-scalability.free_time
    417392           +12.3%     468921        vm-scalability.median
  43811736           +11.2%   48709267        vm-scalability.throughput
    243.55            -3.7%     234.62        vm-scalability.time.elapsed_time
    243.55            -3.7%     234.62        vm-scalability.time.elapsed_time.max
     57267 ± 10%     -29.6%      40324        vm-scalability.time.involuntary_context_switches
      2344            -7.0%       2180        vm-scalability.time.percent_of_cpu_this_job_got
      3981           -20.7%       3155 ±  4%  vm-scalability.time.system_time
      1730           +13.3%       1960 ±  6%  vm-scalability.time.user_time
     75.25            +2.3%      77.00        vmstat.cpu.id
     16.12            -2.8       13.32 ±  4%  mpstat.cpu.all.sys%
      6.96            +1.2        8.15 ±  6%  mpstat.cpu.all.usr%
   5511015 ±  5%     -12.2%    4837482 ±  8%  numa-meminfo.node0.Mapped
     11416 ±  2%     -13.5%       9871 ± 12%  numa-meminfo.node0.PageTables
     17772 ±  4%     -10.2%      15957        meminfo.Active
     17617 ±  4%     -10.4%      15790        meminfo.Active(anon)
  10864238            -9.0%    9886317        meminfo.Mapped
   1386943 ±  4%     -14.6%    1185103 ± 12%  numa-vmstat.node0.nr_mapped
      2842 ±  2%     -14.4%       2433 ± 10%  numa-vmstat.node0.nr_page_table_pages
     17056 ± 27%    +346.2%      76106 ± 63%  numa-vmstat.node0.numa_other
   1424454 ±  6%  +19661.4%  2.815e+08 ±142%  cpuidle.C1.time
     84757 ±  4%   +4598.2%    3982093 ±150%  cpuidle.C1.usage
    130831 ±  2%    +230.4%     432228 ± 94%  cpuidle.POLL.time
     37049 ±  2%     +90.7%      70640 ± 34%  cpuidle.POLL.usage
      1415 ±  7%     +19.9%       1696 ±  6%  slabinfo.dmaengine-unmap-16.active_objs
      1415 ±  7%     +19.9%       1696 ±  6%  slabinfo.dmaengine-unmap-16.num_objs
      3302 ±  4%     -15.0%       2808 ±  5%  slabinfo.fsnotify_mark_connector.active_objs
      3302 ±  4%     -15.0%       2808 ±  5%  slabinfo.fsnotify_mark_connector.num_objs
      4403 ±  5%     -10.3%       3949        proc-vmstat.nr_active_anon
  12556993            -0.7%   12471053        proc-vmstat.nr_inactive_anon
   2697260            -9.9%    2430793 ±  5%  proc-vmstat.nr_mapped
      5545            -7.6%       5122 ±  5%  proc-vmstat.nr_page_table_pages
  12490172            -0.7%   12404206        proc-vmstat.nr_shmem
    234365            +3.1%     241704        proc-vmstat.nr_unevictable
      4403 ±  5%     -10.3%       3949        proc-vmstat.nr_zone_active_anon
  12556992            -0.7%   12471053        proc-vmstat.nr_zone_inactive_anon
    234365            +3.1%     241704        proc-vmstat.nr_zone_unevictable
    178.00 ±  4%   +2661.2%       4915 ±156%  interrupts.41:PCI-MSI.67633156-edge.eth0-TxRx-3
      2366 ±  7%     -14.0%       2036 ±  3%  interrupts.CPU0.NMI:Non-maskable_interrupts
      2366 ±  7%     -14.0%       2036 ±  3%  interrupts.CPU0.PMI:Performance_monitoring_interrupts
      1108 ± 41%     -51.8%     534.50 ±  5%  interrupts.CPU1.CAL:Function_call_interrupts
    588.25 ± 16%     -16.5%     491.25 ±  3%  interrupts.CPU17.CAL:Function_call_interrupts
    159.50 ± 51%     -37.9%      99.00 ±  9%  interrupts.CPU22.RES:Rescheduling_interrupts
    118.25 ± 12%     -20.7%      93.75 ± 17%  interrupts.CPU25.RES:Rescheduling_interrupts
      2260 ±  2%     -22.1%       1760 ± 24%  interrupts.CPU26.NMI:Non-maskable_interrupts
      2260 ±  2%     -22.1%       1760 ± 24%  interrupts.CPU26.PMI:Performance_monitoring_interrupts
    178.00 ±  4%   +2661.2%       4915 ±156%  interrupts.CPU33.41:PCI-MSI.67633156-edge.eth0-TxRx-3
    191.75 ± 58%     -63.0%      71.00 ± 10%  interrupts.CPU36.RES:Rescheduling_interrupts
    140.75 ± 30%     -41.9%      81.75 ± 27%  interrupts.CPU40.RES:Rescheduling_interrupts
    224.25 ± 72%     -62.1%      85.00 ± 32%  interrupts.CPU48.RES:Rescheduling_interrupts
    726.75 ± 19%     -29.9%     509.50 ±  5%  interrupts.CPU52.CAL:Function_call_interrupts
    734.25 ± 22%     -22.9%     566.00 ± 19%  interrupts.CPU53.CAL:Function_call_interrupts
    939.00 ± 31%     -37.3%     589.00 ±  4%  interrupts.CPU54.CAL:Function_call_interrupts
    710.25 ± 17%     -24.7%     534.50 ± 11%  interrupts.CPU56.CAL:Function_call_interrupts
    648.75 ± 17%     -22.5%     502.50 ±  4%  interrupts.CPU61.CAL:Function_call_interrupts
    790.75 ± 30%     -30.7%     547.75 ± 16%  interrupts.CPU65.CAL:Function_call_interrupts
      2389 ± 11%     -20.4%       1902        interrupts.CPU76.NMI:Non-maskable_interrupts
      2389 ± 11%     -20.4%       1902        interrupts.CPU76.PMI:Performance_monitoring_interrupts
      2959 ± 42%     -35.8%       1900        interrupts.CPU83.NMI:Non-maskable_interrupts
      2959 ± 42%     -35.8%       1900        interrupts.CPU83.PMI:Performance_monitoring_interrupts
    879.50 ±137%     -87.4%     110.75 ± 36%  interrupts.CPU84.RES:Rescheduling_interrupts
      2120 ± 17%     -41.0%       1252 ± 42%  interrupts.CPU85.NMI:Non-maskable_interrupts
      2120 ± 17%     -41.0%       1252 ± 42%  interrupts.CPU85.PMI:Performance_monitoring_interrupts
     94.00 ± 14%     +42.8%     134.25 ± 20%  interrupts.CPU85.RES:Rescheduling_interrupts
    172.75 ± 46%     -36.9%     109.00 ± 38%  interrupts.CPU89.RES:Rescheduling_interrupts
      1658 ± 87%     -69.2%     510.50 ±  9%  interrupts.CPU9.CAL:Function_call_interrupts
     23465 ± 19%     -45.1%      12876 ±  4%  interrupts.RES:Rescheduling_interrupts
 1.377e+10            +3.2%  1.422e+10        perf-stat.i.branch-instructions
  48655466            +5.7%   51418236        perf-stat.i.cache-misses
 6.967e+10            -7.5%  6.442e+10        perf-stat.i.cpu-cycles
      1313            -5.1%       1247        perf-stat.i.cycles-between-cache-misses
 1.402e+10            +3.1%  1.446e+10        perf-stat.i.dTLB-loads
   2255710            +3.8%    2341501        perf-stat.i.dTLB-store-misses
 3.814e+09            +3.6%   3.95e+09        perf-stat.i.dTLB-stores
   3316652           +62.5%    5387931 ±  3%  perf-stat.i.iTLB-load-misses
 4.976e+10            +3.1%  5.132e+10        perf-stat.i.instructions
      0.67            -7.7%       0.62        perf-stat.i.metric.GHz
    305.99            +3.1%     315.37        perf-stat.i.metric.M/sec
   2118563            +3.9%    2200504        perf-stat.i.minor-faults
   7903787            +4.1%    8229701        perf-stat.i.node-stores
   2118563            +3.9%    2200504        perf-stat.i.page-faults
      1.40           -10.4%       1.26        perf-stat.overall.cpi
      1432           -12.5%       1252        perf-stat.overall.cycles-between-cache-misses
     39.88 ±  2%     +12.5       52.34 ±  3%  perf-stat.overall.iTLB-load-miss-rate%
     15005           -36.5%       9534 ±  3%  perf-stat.overall.instructions-per-iTLB-miss
      0.71           +11.6%       0.80        perf-stat.overall.ipc
 1.377e+10            +3.0%  1.419e+10        perf-stat.ps.branch-instructions
  48656706            +5.5%   51312384        perf-stat.ps.cache-misses
 6.968e+10            -7.7%  6.429e+10        perf-stat.ps.cpu-cycles
 1.402e+10            +2.9%  1.442e+10        perf-stat.ps.dTLB-loads
   2255925            +3.6%    2336219        perf-stat.ps.dTLB-store-misses
 3.811e+09            +3.4%   3.94e+09        perf-stat.ps.dTLB-stores
   3316265           +62.1%    5375280 ±  3%  perf-stat.ps.iTLB-load-misses
 4.975e+10            +2.9%   5.12e+10        perf-stat.ps.instructions
   2119230            +3.6%    2195912        perf-stat.ps.minor-faults
   7906342            +3.9%    8212206        perf-stat.ps.node-stores
   2119230            +3.6%    2195912        perf-stat.ps.page-faults
     10352 ±  2%     +11.6%      11557 ±  6%  softirqs.CPU100.RCU
     10309 ±  2%     +13.9%      11740 ± 10%  softirqs.CPU101.RCU
     10693           +18.0%      12621 ± 10%  softirqs.CPU15.RCU
     10974 ±  2%      +9.5%      12021 ±  3%  softirqs.CPU17.RCU
     10715 ±  4%     +11.6%      11958 ±  5%  softirqs.CPU19.RCU
     11394 ±  2%      +6.8%      12171 ±  5%  softirqs.CPU2.RCU
     10500 ±  5%     +18.0%      12387 ±  8%  softirqs.CPU21.RCU
     10393 ±  4%     +13.8%      11830 ±  3%  softirqs.CPU23.RCU
     10377 ±  9%     +14.0%      11832 ±  5%  softirqs.CPU25.RCU
     10270 ±  6%     +17.1%      12023 ± 11%  softirqs.CPU29.RCU
     11134 ±  3%     +14.1%      12700 ± 10%  softirqs.CPU32.RCU
     10837 ±  3%     +10.2%      11940 ±  3%  softirqs.CPU37.RCU
     25056 ±  3%     -10.7%      22376 ± 10%  softirqs.CPU4.SCHED
     10758 ±  2%     +13.6%      12226 ±  4%  softirqs.CPU41.RCU
     10704 ±  2%     +14.5%      12257 ±  9%  softirqs.CPU45.RCU
     10480 ±  6%      +8.1%      11327 ±  4%  softirqs.CPU47.RCU
     10427 ±  3%      +8.9%      11359 ±  6%  softirqs.CPU48.RCU
     10105 ±  4%     +25.4%      12673 ± 14%  softirqs.CPU49.RCU
     10258 ±  4%      +9.3%      11210 ±  4%  softirqs.CPU52.RCU
     11962 ± 18%     -16.5%       9990 ±  5%  softirqs.CPU53.RCU
     10278           +11.4%      11454 ±  3%  softirqs.CPU58.RCU
     10115 ±  3%     +11.6%      11288 ±  4%  softirqs.CPU59.RCU
     10227 ±  3%     +13.7%      11624 ±  5%  softirqs.CPU60.RCU
     10524 ±  5%      +8.5%      11423 ±  4%  softirqs.CPU62.RCU
     10546 ±  3%     +11.8%      11790 ±  4%  softirqs.CPU64.RCU
     10005 ±  3%     +13.7%      11378 ±  4%  softirqs.CPU65.RCU
     10201 ±  2%     +17.3%      11969 ±  4%  softirqs.CPU66.RCU
     10367 ±  2%     +12.3%      11637 ±  2%  softirqs.CPU67.RCU
     10233 ±  4%     +16.5%      11920 ±  3%  softirqs.CPU68.RCU
     10701 ±  4%      +8.6%      11623 ±  3%  softirqs.CPU71.RCU
     10005 ±  4%     +13.7%      11374 ±  2%  softirqs.CPU72.RCU
      9034 ±  3%     +14.8%      10368        softirqs.CPU75.RCU
      9290 ±  3%     +11.8%      10386 ±  3%  softirqs.CPU76.RCU
     10645 ±  5%     +14.3%      12167 ±  2%  softirqs.CPU81.RCU
     10657 ±  3%     +11.5%      11884 ±  3%  softirqs.CPU82.RCU
     10582           +35.4%      14324 ±  7%  softirqs.CPU83.RCU
     10474 ±  4%     +24.0%      12983 ± 19%  softirqs.CPU86.RCU
     10093 ±  2%     +16.8%      11788 ± 11%  softirqs.CPU92.RCU
     10275           +14.4%      11757 ±  8%  softirqs.CPU95.RCU
     10499 ±  4%     +14.8%      12057 ±  9%  softirqs.CPU96.RCU
     10228           +16.6%      11931 ± 10%  softirqs.CPU97.RCU
   1120806 ±  2%      +9.8%    1230893 ±  3%  softirqs.RCU
     27146 ± 10%     -43.0%      15461 ± 17%  sched_debug.cfs_rq:/.exec_clock.avg
     35739 ±  9%     -33.4%      23815 ± 13%  sched_debug.cfs_rq:/.exec_clock.max
     24614 ± 11%     -43.1%      14012 ± 17%  sched_debug.cfs_rq:/.exec_clock.min
      2252 ± 14%     -28.3%       1613 ±  8%  sched_debug.cfs_rq:/.exec_clock.stddev
     26221 ± 25%     -48.2%      13579 ± 29%  sched_debug.cfs_rq:/.load.avg
     31.57 ± 21%     -38.4%      19.45 ± 14%  sched_debug.cfs_rq:/.load_avg.avg
    594.11 ± 16%     -18.6%     483.79 ±  5%  sched_debug.cfs_rq:/.load_avg.max
    102.33 ± 16%     -29.0%      72.67 ± 11%  sched_debug.cfs_rq:/.load_avg.stddev
   2729295 ± 10%     -43.0%    1556498 ± 17%  sched_debug.cfs_rq:/.min_vruntime.avg
   2841927 ±  9%     -43.0%    1620688 ± 17%  sched_debug.cfs_rq:/.min_vruntime.max
   2545899 ± 11%     -42.6%    1461699 ± 17%  sched_debug.cfs_rq:/.min_vruntime.min
     75990 ± 36%     -56.0%      33447 ± 33%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.34 ± 10%     -30.0%       0.24 ± 16%  sched_debug.cfs_rq:/.nr_running.avg
     20.09 ± 12%     -43.1%      11.43 ± 19%  sched_debug.cfs_rq:/.nr_spread_over.avg
    164.81 ± 16%     -37.1%     103.71 ± 20%  sched_debug.cfs_rq:/.nr_spread_over.max
     33.80 ± 15%     -39.4%      20.50 ± 13%  sched_debug.cfs_rq:/.nr_spread_over.stddev
    375.78 ±  9%     -24.0%     285.69 ± 12%  sched_debug.cfs_rq:/.runnable_avg.avg
   -183105           -63.3%     -67248        sched_debug.cfs_rq:/.spread0.min
     76034 ± 37%     -56.0%      33448 ± 33%  sched_debug.cfs_rq:/.spread0.stddev
    369.43 ±  9%     -23.5%     282.61 ± 11%  sched_debug.cfs_rq:/.util_avg.avg
    696.88 ±  4%     -20.5%     553.75 ±  9%  sched_debug.cfs_rq:/.util_est_enqueued.max
     93.11 ±  4%     -17.5%      76.85 ±  9%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
    131720 ±  7%     +21.2%     159588 ±  5%  sched_debug.cpu.avg_idle.stddev
    153356 ±  8%     -29.2%     108595 ± 12%  sched_debug.cpu.clock.avg
    153364 ±  8%     -29.2%     108602 ± 12%  sched_debug.cpu.clock.max
    153348 ±  8%     -29.2%     108588 ± 12%  sched_debug.cpu.clock.min
    151791 ±  8%     -29.1%     107548 ± 12%  sched_debug.cpu.clock_task.avg
    152331 ±  8%     -29.2%     107864 ± 12%  sched_debug.cpu.clock_task.max
    146177 ±  8%     -30.1%     102228 ± 12%  sched_debug.cpu.clock_task.min
     10821 ±  8%     -28.9%       7692 ± 12%  sched_debug.cpu.curr->pid.max
      0.32 ± 12%     -27.4%       0.23 ± 19%  sched_debug.cpu.nr_running.avg
      5319 ±  9%     -29.2%       3764 ± 11%  sched_debug.cpu.nr_switches.avg
      2044 ±  8%     -27.5%       1481 ± 12%  sched_debug.cpu.nr_switches.min
     30.00 ± 28%     +60.0%      48.00 ± 15%  sched_debug.cpu.nr_uninterruptible.max
      6.75 ± 14%     +21.5%       8.19 ±  5%  sched_debug.cpu.nr_uninterruptible.stddev
      3776 ± 12%     -41.9%       2194 ± 20%  sched_debug.cpu.sched_count.avg
      1236 ±  9%     -41.2%     727.58 ± 17%  sched_debug.cpu.sched_count.min
      1477 ± 10%     -36.7%     935.58 ± 20%  sched_debug.cpu.sched_goidle.avg
    386.73 ±  9%     -37.5%     241.60 ± 22%  sched_debug.cpu.sched_goidle.min
      1705 ± 12%     -42.8%     975.64 ± 21%  sched_debug.cpu.ttwu_count.avg
     24239 ± 22%     -31.5%      16612 ± 15%  sched_debug.cpu.ttwu_count.max
    522.40 ± 13%     -41.1%     307.79 ± 21%  sched_debug.cpu.ttwu_count.min
      2783 ± 15%     -33.1%       1862 ± 12%  sched_debug.cpu.ttwu_count.stddev
    682.47 ± 15%     -48.1%     354.52 ± 18%  sched_debug.cpu.ttwu_local.avg
      5220 ± 16%     -59.3%       2122 ± 18%  sched_debug.cpu.ttwu_local.max
    282.79 ± 13%     -39.4%     171.29 ± 21%  sched_debug.cpu.ttwu_local.min
    709.90 ± 15%     -57.2%     304.11 ± 13%  sched_debug.cpu.ttwu_local.stddev
    153349 ±  8%     -29.2%     108590 ± 12%  sched_debug.cpu_clk
    152855 ±  8%     -29.3%     108094 ± 12%  sched_debug.ktime
    154041 ±  8%     -29.1%     109253 ± 12%  sched_debug.sched_clk
     25.29 ±119%     -25.3        0.00        perf-profile.calltrace.cycles-pp.asm_exc_page_fault
     24.30 ±119%     -24.3        0.00        perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault
     24.14 ±119%     -24.1        0.00        perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
     23.72 ±119%     -23.7        0.00        perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
     26.28 ± 10%     -14.8       11.48 ±  3%  perf-profile.calltrace.cycles-pp.lru_cache_add.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
     26.12 ± 10%     -14.8       11.32 ±  3%  perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.lru_cache_add.shmem_getpage_gfp.shmem_fault.__do_fault
     23.96 ± 10%     -14.5        9.45 ±  4%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.shmem_getpage_gfp
     24.02 ± 10%     -14.5        9.51 ±  4%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.shmem_getpage_gfp.shmem_fault
     56.91 ± 12%     -11.9       45.00        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
     56.41 ± 12%     -11.9       44.51        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
     48.60 ± 12%     -11.6       37.03        perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
     48.89 ± 12%     -11.6       37.32        perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
     48.96 ± 12%     -11.6       37.39        perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      2.75 ± 24%      -2.5        0.30 ±101%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlinkat
      2.75 ± 24%      -2.5        0.30 ±101%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlinkat
      2.75 ± 24%      -2.5        0.30 ±101%  perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlinkat
      2.75 ± 24%      -2.5        0.30 ±101%  perf-profile.calltrace.cycles-pp.evict.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlinkat
      2.75 ± 24%      -2.5        0.30 ±101%  perf-profile.calltrace.cycles-pp.unlinkat
      1.82 ± 18%      -0.6        1.26 ±  2%  perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.82 ± 18%      -0.6        1.26 ±  2%  perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
      1.80 ± 18%      -0.6        1.24 ±  2%  perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
      1.81 ± 18%      -0.6        1.25 ±  2%  perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
      1.81 ± 18%      -0.6        1.25 ±  2%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
      2.20 ± 11%      -0.5        1.73        perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      2.15 ± 11%      -0.5        1.67        perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
      1.81 ± 11%      -0.5        1.33        perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
      1.35 ± 16%      -0.3        1.02 ± 10%  perf-profile.calltrace.cycles-pp.ret_from_fork
      1.35 ± 16%      -0.3        1.02 ± 10%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
      1.33 ± 16%      -0.3        1.00 ± 10%  perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
      1.32 ± 16%      -0.3        0.99 ± 10%  perf-profile.calltrace.cycles-pp.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread.ret_from_fork
      1.33 ± 17%      -0.3        1.00 ± 10%  perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
      1.28 ± 16%      -0.3        0.96 ± 10%  perf-profile.calltrace.cycles-pp.memcpy_erms.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread
      1.68 ±  7%      -0.2        1.48        perf-profile.calltrace.cycles-pp.__pagevec_lru_add_fn.pagevec_lru_move_fn.lru_cache_add.shmem_getpage_gfp.shmem_fault
      0.62 ± 60%      +0.4        1.05 ±  3%  perf-profile.calltrace.cycles-pp.__irqentry_text_end.do_access
      1.23 ± 61%      +1.2        2.40        perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_exc_page_fault.do_access
      2.24 ± 21%      +1.2        3.48 ±  5%  perf-profile.calltrace.cycles-pp.get_mem_cgroup_from_mm.mem_cgroup_charge.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault
      0.87 ±114%      +1.6        2.47 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.88 ±114%      +1.6        2.47 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
      9.19 ± 17%      +2.4       11.55 ±  3%  perf-profile.calltrace.cycles-pp.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
      0.00            +2.4        2.38 ±  3%  perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +2.4        2.38 ±  3%  perf-profile.calltrace.cycles-pp.evict.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.64 ± 21%      +2.8        8.47 ±  4%  perf-profile.calltrace.cycles-pp.mem_cgroup_charge.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault.__do_fault
     13.37 ± 61%      +9.1       22.42        perf-profile.calltrace.cycles-pp.do_rw_once
     26.17 ± 10%     -14.8       11.35 ±  3%  perf-profile.children.cycles-pp.pagevec_lru_move_fn
     26.28 ± 10%     -14.8       11.48 ±  3%  perf-profile.children.cycles-pp.lru_cache_add
     24.20 ± 10%     -14.5        9.67 ±  4%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     24.12 ± 10%     -14.5        9.62 ±  4%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     56.45 ± 12%     -11.9       44.54        perf-profile.children.cycles-pp.do_fault
     56.95 ± 12%     -11.9       45.05        perf-profile.children.cycles-pp.__handle_mm_fault
     59.12 ± 12%     -11.8       47.29        perf-profile.children.cycles-pp.do_user_addr_fault
     58.05 ± 12%     -11.8       46.22        perf-profile.children.cycles-pp.handle_mm_fault
     59.51 ± 12%     -11.8       47.73        perf-profile.children.cycles-pp.exc_page_fault
     48.62 ± 12%     -11.6       37.05        perf-profile.children.cycles-pp.shmem_getpage_gfp
     48.89 ± 12%     -11.6       37.32        perf-profile.children.cycles-pp.shmem_fault
     48.96 ± 12%     -11.6       37.40        perf-profile.children.cycles-pp.__do_fault
     63.35 ± 10%     -10.5       52.82        perf-profile.children.cycles-pp.asm_exc_page_fault
      2.75 ± 24%      -2.2        0.51 ± 19%  perf-profile.children.cycles-pp.unlinkat
      1.83 ± 18%      -0.6        1.27 ±  2%  perf-profile.children.cycles-pp.unmap_region
      1.84 ± 18%      -0.6        1.27 ±  2%  perf-profile.children.cycles-pp.__do_munmap
      1.83 ± 18%      -0.6        1.26 ±  2%  perf-profile.children.cycles-pp.__vm_munmap
      1.82 ± 18%      -0.6        1.26 ±  2%  perf-profile.children.cycles-pp.__x64_sys_munmap
      1.82 ± 18%      -0.6        1.26 ±  2%  perf-profile.children.cycles-pp.zap_pte_range
      1.82 ± 18%      -0.6        1.27 ±  2%  perf-profile.children.cycles-pp.unmap_page_range
      1.82 ± 18%      -0.6        1.27 ±  2%  perf-profile.children.cycles-pp.unmap_vmas
      1.16 ± 17%      -0.5        0.66 ±  4%  perf-profile.children.cycles-pp.page_remove_rmap
      2.21 ± 11%      -0.5        1.73        perf-profile.children.cycles-pp.finish_fault
      1.82 ± 11%      -0.5        1.34        perf-profile.children.cycles-pp.page_add_file_rmap
      2.56 ± 11%      -0.5        2.09        perf-profile.children.cycles-pp.alloc_set_pte
      1.35 ± 16%      -0.3        1.02 ± 10%  perf-profile.children.cycles-pp.ret_from_fork
      1.35 ± 16%      -0.3        1.02 ± 10%  perf-profile.children.cycles-pp.kthread
      1.33 ± 16%      -0.3        1.00 ± 10%  perf-profile.children.cycles-pp.worker_thread
      1.32 ± 16%      -0.3        0.99 ± 10%  perf-profile.children.cycles-pp.drm_fb_helper_dirty_work
      1.33 ± 17%      -0.3        1.00 ± 10%  perf-profile.children.cycles-pp.process_one_work
      1.32 ± 16%      -0.3        0.99 ± 10%  perf-profile.children.cycles-pp.memcpy_erms
      1.69 ±  7%      -0.2        1.50        perf-profile.children.cycles-pp.__pagevec_lru_add_fn
      0.08 ± 11%      -0.0        0.06 ±  9%  perf-profile.children.cycles-pp.mem_cgroup_page_lruvec
      0.27 ± 12%      +0.1        0.32 ±  6%  perf-profile.children.cycles-pp.xas_create_range
      0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.get_vma_policy
      0.00            +0.2        0.23 ±  6%  perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
      2.25 ± 21%      +1.2        3.49 ±  5%  perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
      9.22 ± 17%      +2.4       11.59 ±  3%  perf-profile.children.cycles-pp.shmem_add_to_page_cache
      5.66 ± 21%      +2.8        8.49 ±  4%  perf-profile.children.cycles-pp.mem_cgroup_charge
     12.88 ± 61%      +9.7       22.58        perf-profile.children.cycles-pp.do_rw_once
     24.12 ± 10%     -14.5        9.62 ±  4%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      1.36 ± 15%      -0.6        0.80        perf-profile.self.cycles-pp.shmem_add_to_page_cache
      0.65 ± 12%      -0.4        0.26 ±  9%  perf-profile.self.cycles-pp.page_add_file_rmap
      1.31 ± 16%      -0.3        0.98 ± 10%  perf-profile.self.cycles-pp.memcpy_erms
      0.48 ± 17%      -0.3        0.17 ±  2%  perf-profile.self.cycles-pp.page_remove_rmap
      1.06 ± 11%      -0.3        0.79        perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
      0.80 ±  7%      -0.2        0.57        perf-profile.self.cycles-pp.__pagevec_lru_add_fn
      0.07 ± 12%      -0.0        0.04 ± 57%  perf-profile.self.cycles-pp.mem_cgroup_page_lruvec
      0.08 ± 15%      -0.0        0.06 ±  6%  perf-profile.self.cycles-pp.truncate_cleanup_page
      0.08 ± 15%      +0.0        0.12 ±  5%  perf-profile.self.cycles-pp.xas_create_range
      0.15 ± 19%      +0.0        0.19 ±  4%  perf-profile.self.cycles-pp.xas_find_conflict
      0.00            +0.1        0.14 ±  3%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
      2.23 ± 21%      +1.2        3.46 ±  5%  perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
      2.15 ± 25%      +1.5        3.66 ±  4%  perf-profile.self.cycles-pp.mem_cgroup_charge
      6.88 ± 61%      +4.0       10.87        perf-profile.self.cycles-pp.do_access
     10.47 ± 61%      +8.7       19.20        perf-profile.self.cycles-pp.do_rw_once
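
To sanity-check the cycles-pp numbers above outside the harness, generic
perf invocations (not necessarily the harness's exact ones) can capture a
comparable system-wide profile while the workload runs:

        perf record -a -g -e cycles:pp -- sleep 60   # sample all CPUs during the run
        perf report --stdio --no-children            # self cycles, as in the table above
        perf stat -a -e cycles,instructions,iTLB-load-misses -- sleep 60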


                                                                                
                               vm-scalability.throughput                        
                                                                                
    5e+07 +-----------------------------------------------------------------+   
          |     O          O               O                      O         |   
  4.9e+07 |-+         O O                                O  O   O   O       |   
  4.8e+07 |-O O   O O        O   O O O O O   O O              O             |   
          |                    O                   O O O                    |   
  4.7e+07 |-+                                    O                          |   
  4.6e+07 |.+. .+.+                                                         |   
          |   +    +                                                        |   
  4.5e+07 |-+       +      +.                           .+..   .+           |   
  4.4e+07 |-+        +   ..  +.                       .+     .+  :   .+. .+.|   
          |           +.+      +.               .+   +      +    : .+   +   |   
  4.3e+07 |-+                    +.+.+.+   +   +  + +             +         |   
  4.2e+07 |-+                           : : + +    +                        |   
          |                             : :  +                              |   
  4.1e+07 +-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.8.0-12308-g170b04b7ae496" of type "text/plain" (170150 bytes)

View attachment "job-script" of type "text/plain" (7566 bytes)

View attachment "job.yaml" of type "text/plain" (5290 bytes)

View attachment "reproduce" of type "text/plain" (345948 bytes)
