Message-ID: <20190620015813.GM7221@shao2-debian>
Date:   Thu, 20 Jun 2019 09:58:13 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        kernel test robot <rong.a.chen@...el.com>,
        Michal Hocko <mhocko@...nel.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Roman Gushchin <guro@...com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [mm] 815744d751: will-it-scale.per_process_ops 43.3% improvement

Greetings,

FYI, we noticed a 43.3% improvement in will-it-scale.per_process_ops due to commit:


commit: 815744d75152078cde5391fc1e3c2d4424323fb6 ("mm: memcontrol: don't batch updates of local VM stats and events")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
with the following parameters:

	nr_task: 100%
	mode: process
	test: page_fault3
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a threads-based variant of each testcase in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
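
For orientation, the page_fault3 case stresses shared, file-backed write faults: each pass maps a shared region, writes one byte per page so that every page takes a write fault, and then tears the mapping down again (which is why the munmap/unmap_region paths also show up in the profile below). The sketch below is only a simplified illustration of that loop; the sizes, file handling, and iteration count are invented here and are not copied from the will-it-scale source.

    /*
     * Simplified sketch of a page_fault3-style loop (illustration only,
     * not the upstream will-it-scale testcase).  Each pass maps a shared
     * file-backed region, dirties every page to force one shared write
     * fault per page, then unmaps the region.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/mman.h>

    #define MEMSIZE (128UL * 1024 * 1024)   /* size is illustrative */

    int main(void)
    {
        long pgsize = sysconf(_SC_PAGESIZE);
        char path[] = "/tmp/willitscale.XXXXXX";
        int fd = mkstemp(path);

        if (fd < 0 || ftruncate(fd, MEMSIZE) < 0) {
            perror("setup");
            return 1;
        }
        unlink(path);   /* keep the file anonymous once mapped */

        for (unsigned long iter = 0; iter < 100; iter++) {
            char *c = mmap(NULL, MEMSIZE, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
            if (c == MAP_FAILED) {
                perror("mmap");
                return 1;
            }
            /* one write per page: each write triggers a shared write fault */
            for (unsigned long off = 0; off < MEMSIZE; off += pgsize)
                c[off] = 0;
            munmap(c, MEMSIZE);
        }
        close(fd);
        return 0;
    }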



Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml
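
For background on the commit itself: the hot functions that shrink in the profile below (__mod_memcg_state, __count_memcg_events, lock_page_memcg) belong to the memory-cgroup statistics path that runs on every one of these faults. Memcg stats are kept in per-CPU counters and propagated to shared atomic counters only after a batching threshold; per the commit subject, the cgroup-local counters no longer go through that batching, which is reserved for the hierarchical propagation. The sketch below only illustrates that general batching pattern with invented names; it is not the mm/memcontrol.c code.

    /*
     * Illustration only -- invented names, not mm/memcontrol.c.  Shows
     * the pattern suggested by the commit subject: the "local" counter
     * is updated directly on every event, while the more expensive
     * shared ("hierarchical") counter is only touched once a per-thread
     * batch of updates has accumulated.
     */
    #include <stdatomic.h>
    #include <stdio.h>

    #define BATCH 32                          /* illustrative threshold */

    static _Thread_local long local_count;    /* stand-in for a per-CPU counter */
    static _Thread_local long pending;        /* batched delta not yet flushed  */
    static atomic_long hierarchy_count;       /* shared counter, costly to touch */

    static void count_event(long nr)
    {
        /* local stats: plain per-thread add, never batched */
        local_count += nr;

        /* hierarchical stats: flushed only when the batch threshold is hit */
        pending += nr;
        if (pending >= BATCH) {
            atomic_fetch_add_explicit(&hierarchy_count, pending,
                                      memory_order_relaxed);
            pending = 0;
        }
    }

    int main(void)
    {
        for (int i = 0; i < 1000; i++)
            count_event(1);

        printf("local=%ld hierarchy=%ld (pending=%ld)\n",
               local_count, atomic_load(&hierarchy_count), pending);
        return 0;
    }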

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-2019-05-14.cgz/lkp-csl-2ap1/page_fault3/will-it-scale

commit: 
  c11fb13a11 ("Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid")
  815744d751 ("mm: memcontrol: don't batch updates of local VM stats and events")

c11fb13a117e5a67 815744d75152078cde5391fc1e3 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
         16:4          183%          23:4     perf-profile.calltrace.cycles-pp.sync_regs.error_entry.testcase
         18:4          204%          26:4     perf-profile.calltrace.cycles-pp.error_entry.testcase
          0:4            3%           0:4     perf-profile.children.cycles-pp.error_exit
         19:4          216%          28:4     perf-profile.children.cycles-pp.error_entry
          0:4            2%           0:4     perf-profile.self.cycles-pp.error_exit
          2:4           24%           3:4     perf-profile.self.cycles-pp.error_entry
         %stddev     %change         %stddev
             \          |                \  
    470022           +43.3%     673338        will-it-scale.per_process_ops
  90244291           +43.3%  1.293e+08        will-it-scale.workload
     33.13            -2.1%      32.44        boot-time.dhcp
     13.47 ±  6%      -1.0       12.47        mpstat.cpu.all.usr%
      0.14 ± 10%     +76.8%       0.25 ±  7%  turbostat.CPU%c1
    609.72            +9.7%     668.75        turbostat.PkgWatt
     22201 ±  2%     -11.1%      19733 ±  3%  numa-meminfo.node2.Inactive
     22201 ±  2%     -12.0%      19545 ±  4%  numa-meminfo.node2.Inactive(anon)
      6060 ±  4%     +15.5%       7000 ±  3%  numa-meminfo.node2.KernelStack
     30791 ± 32%     -47.1%      16275 ± 22%  numa-meminfo.node3.Inactive
     30675 ± 33%     -47.5%      16090 ± 22%  numa-meminfo.node3.Inactive(anon)
     15906            -1.4%      15683        proc-vmstat.nr_page_table_pages
  60975063           +38.1%   84216668        proc-vmstat.numa_hit
  60881605           +38.2%   84123351        proc-vmstat.numa_local
  61181699           +38.1%   84469061        proc-vmstat.pgalloc_normal
 2.713e+10           +43.3%  3.887e+10        proc-vmstat.pgfault
  58818843 ±  5%     +38.1%   81212599 ±  4%  proc-vmstat.pgfree
  14513924 ±  2%     +41.8%   20585689 ±  2%  numa-numastat.node0.local_node
  14537287 ±  2%     +41.7%   20601299 ±  2%  numa-numastat.node0.numa_hit
  15413984 ±  2%     +37.3%   21168399        numa-numastat.node1.local_node
  15437340 ±  2%     +37.3%   21191634        numa-numastat.node1.numa_hit
  15397026           +38.9%   21394060        numa-numastat.node2.local_node
  15428108           +38.8%   21417417        numa-numastat.node2.numa_hit
  15632711           +35.1%   21119563        numa-numastat.node3.local_node
  15648357           +35.2%   21150655        numa-numastat.node3.numa_hit
    444.19 ± 12%     +41.5%     628.34 ±  3%  sched_debug.cfs_rq:/.exec_clock.stddev
    172846 ±  7%     +19.1%     205841        sched_debug.cfs_rq:/.min_vruntime.stddev
      0.05 ±  8%     -22.1%       0.04 ±  5%  sched_debug.cfs_rq:/.nr_running.stddev
    171875 ±  7%     +19.2%     204919        sched_debug.cfs_rq:/.spread0.stddev
    280585 ± 89%     -93.9%      17176 ± 64%  sched_debug.cpu.avg_idle.min
     24193 ± 31%     +61.1%      38975 ± 24%  sched_debug.cpu.nr_switches.max
      1136 ± 13%     +34.4%       1527 ± 16%  sched_debug.cpu.ttwu_count.stddev
      1018 ± 15%     +44.2%       1467 ± 18%  sched_debug.cpu.ttwu_local.stddev
   8487551 ±  2%     +35.9%   11538420 ±  2%  numa-vmstat.node0.numa_hit
   8464705 ±  2%     +36.1%   11523132 ±  2%  numa-vmstat.node0.numa_local
   8798431           +32.5%   11661199        numa-vmstat.node1.numa_hit
   8689509 ±  2%     +32.9%   11552294        numa-vmstat.node1.numa_local
      5568 ±  2%     -12.9%       4847 ±  5%  numa-vmstat.node2.nr_inactive_anon
      6057 ±  4%     +15.6%       7001 ±  3%  numa-vmstat.node2.nr_kernel_stack
      5571 ±  2%     -12.9%       4851 ±  5%  numa-vmstat.node2.nr_zone_inactive_anon
   8648618           +35.6%   11724115        numa-vmstat.node2.numa_hit
   8532135           +36.1%   11615170        numa-vmstat.node2.numa_local
      7672 ± 33%     -48.1%       3980 ± 17%  numa-vmstat.node3.nr_inactive_anon
      7673 ± 33%     -48.1%       3983 ± 17%  numa-vmstat.node3.nr_zone_inactive_anon
   8891497           +30.8%   11626904        numa-vmstat.node3.numa_hit
   8789979           +30.9%   11510256        numa-vmstat.node3.numa_local
    130.25 ±165%     -98.7%       1.75 ± 74%  interrupts.CPU109.RES:Rescheduling_interrupts
    131.00 ±137%     -97.7%       3.00 ± 97%  interrupts.CPU118.RES:Rescheduling_interrupts
    598.75 ±121%     -93.5%      38.75 ± 62%  interrupts.CPU13.RES:Rescheduling_interrupts
     10.25 ±156%   +5961.0%     621.25 ±167%  interrupts.CPU145.RES:Rescheduling_interrupts
    688.50 ±129%     -95.4%      31.50 ±107%  interrupts.CPU16.RES:Rescheduling_interrupts
      1.25 ± 34%  +10960.0%     138.25 ±113%  interrupts.CPU161.RES:Rescheduling_interrupts
    779.50 ±149%     -94.9%      39.75 ± 92%  interrupts.CPU17.RES:Rescheduling_interrupts
    104.25 ± 69%     -93.8%       6.50 ± 35%  interrupts.CPU177.RES:Rescheduling_interrupts
    100.25 ± 90%     -89.3%      10.75 ±118%  interrupts.CPU182.RES:Rescheduling_interrupts
    494.25 ± 60%     -79.0%     103.75 ± 16%  interrupts.CPU2.RES:Rescheduling_interrupts
      4480 ± 16%     -40.5%       2665 ± 55%  interrupts.CPU24.CAL:Function_call_interrupts
     19.50 ±152%    +928.2%     200.50 ± 91%  interrupts.CPU29.RES:Rescheduling_interrupts
      5259 ± 34%     +59.7%       8397        interrupts.CPU39.NMI:Non-maskable_interrupts
      5259 ± 34%     +59.7%       8397        interrupts.CPU39.PMI:Performance_monitoring_interrupts
      5258 ± 34%     +59.6%       8390        interrupts.CPU42.NMI:Non-maskable_interrupts
      5258 ± 34%     +59.6%       8390        interrupts.CPU42.PMI:Performance_monitoring_interrupts
      5253 ± 34%     +59.8%       8393        interrupts.CPU43.NMI:Non-maskable_interrupts
      5253 ± 34%     +59.8%       8393        interrupts.CPU43.PMI:Performance_monitoring_interrupts
      5248 ± 34%     +59.9%       8394        interrupts.CPU44.NMI:Non-maskable_interrupts
      5248 ± 34%     +59.9%       8394        interrupts.CPU44.PMI:Performance_monitoring_interrupts
      5261 ± 34%     +60.0%       8419        interrupts.CPU57.NMI:Non-maskable_interrupts
      5261 ± 34%     +60.0%       8419        interrupts.CPU57.PMI:Performance_monitoring_interrupts
      7874           +20.1%       9459 ± 11%  interrupts.CPU95.RES:Rescheduling_interrupts
    217.00 ± 49%     -80.2%      43.00 ± 93%  interrupts.CPU96.RES:Rescheduling_interrupts
     53003 ±  5%     -13.3%      45949 ±  6%  interrupts.RES:Rescheduling_interrupts
      2.68           -31.4%       1.84        perf-stat.i.MPKI
 4.694e+10           +43.1%  6.715e+10        perf-stat.i.branch-instructions
 1.238e+08           +43.4%  1.775e+08 ±  2%  perf-stat.i.branch-misses
     59.57            +9.1       68.63        perf-stat.i.cache-miss-rate%
 3.661e+08           +14.0%  4.173e+08        perf-stat.i.cache-misses
      2.49           -31.0%       1.72        perf-stat.i.cpi
      1557           -12.7%       1360        perf-stat.i.cycles-between-cache-misses
      0.02 ±109%      -0.0        0.00 ± 13%  perf-stat.i.dTLB-load-miss-rate%
 6.494e+10           +44.9%  9.411e+10        perf-stat.i.dTLB-loads
  1.59e+09           +43.4%   2.28e+09        perf-stat.i.dTLB-store-misses
  3.43e+10           +44.8%  4.965e+10        perf-stat.i.dTLB-stores
  2.29e+11           +44.2%  3.301e+11        perf-stat.i.instructions
      0.40           +44.8%       0.58        perf-stat.i.ipc
  90220636           +43.3%  1.293e+08        perf-stat.i.minor-faults
      6.99 ±  2%      -6.0        0.96 ± 17%  perf-stat.i.node-load-miss-rate%
   2563994 ±  2%     -67.3%     839546 ± 12%  perf-stat.i.node-load-misses
  34196685 ±  2%    +154.9%   87152794 ±  5%  perf-stat.i.node-loads
     14.00            -7.5        6.48        perf-stat.i.node-store-miss-rate%
  14863715 ±  2%     -38.3%    9165413 ±  2%  perf-stat.i.node-store-misses
  91337940           +44.8%  1.323e+08        perf-stat.i.node-stores
  90221535           +43.3%  1.293e+08        perf-stat.i.page-faults
      2.68           -31.4%       1.84        perf-stat.overall.MPKI
     59.57            +9.1       68.63        perf-stat.overall.cache-miss-rate%
      2.49           -31.0%       1.72        perf-stat.overall.cpi
      1557           -12.7%       1359        perf-stat.overall.cycles-between-cache-misses
      0.02 ±109%      -0.0        0.00 ± 13%  perf-stat.overall.dTLB-load-miss-rate%
      0.40           +44.8%       0.58        perf-stat.overall.ipc
      6.98 ±  2%      -6.0        0.96 ± 17%  perf-stat.overall.node-load-miss-rate%
     14.00            -7.5        6.48        perf-stat.overall.node-store-miss-rate%
 4.677e+10           +43.1%  6.692e+10        perf-stat.ps.branch-instructions
 1.234e+08           +43.4%  1.769e+08 ±  2%  perf-stat.ps.branch-misses
 3.647e+08           +14.0%  4.158e+08        perf-stat.ps.cache-misses
 6.471e+10           +45.0%  9.379e+10        perf-stat.ps.dTLB-loads
 1.584e+09           +43.5%  2.273e+09        perf-stat.ps.dTLB-store-misses
 3.417e+10           +44.8%  4.949e+10        perf-stat.ps.dTLB-stores
 2.281e+11           +44.2%   3.29e+11        perf-stat.ps.instructions
  89897978           +43.3%  1.288e+08        perf-stat.ps.minor-faults
   2554791 ±  2%     -67.2%     836695 ± 12%  perf-stat.ps.node-load-misses
  34073965 ±  2%    +154.9%   86855995 ±  5%  perf-stat.ps.node-loads
  14810461 ±  2%     -38.3%    9134179 ±  2%  perf-stat.ps.node-store-misses
  91010555           +44.9%  1.318e+08        perf-stat.ps.node-stores
  89898192           +43.3%  1.288e+08        perf-stat.ps.page-faults
 6.812e+13           +44.2%  9.822e+13        perf-stat.total.instructions
     11934 ±  6%     +35.0%      16111 ± 12%  softirqs.CPU120.RCU
     11745 ±  4%     +14.5%      13450 ±  5%  softirqs.CPU122.RCU
     11990 ±  7%     +11.6%      13378 ±  4%  softirqs.CPU124.RCU
     11979 ±  5%     +12.4%      13466 ±  4%  softirqs.CPU126.RCU
     11997 ±  6%     +11.4%      13370 ±  5%  softirqs.CPU127.RCU
     12165 ±  3%     +12.4%      13677 ±  4%  softirqs.CPU128.RCU
     12213 ±  4%     +10.0%      13431 ±  4%  softirqs.CPU129.RCU
     11827 ±  4%     +16.9%      13831 ±  5%  softirqs.CPU130.RCU
     11469 ±  9%     +19.7%      13725 ±  4%  softirqs.CPU131.RCU
     11869 ±  5%     +15.5%      13711 ±  3%  softirqs.CPU132.RCU
     11751 ±  4%     +15.7%      13596 ±  4%  softirqs.CPU134.RCU
     11675 ±  5%     +16.6%      13615 ±  6%  softirqs.CPU135.RCU
     11900 ±  5%     +15.0%      13687 ±  3%  softirqs.CPU136.RCU
     11959 ±  5%     +14.0%      13636 ±  4%  softirqs.CPU137.RCU
     11940 ±  5%     +13.7%      13576 ±  3%  softirqs.CPU138.RCU
     11905 ±  6%     +15.9%      13804 ±  5%  softirqs.CPU139.RCU
     12342 ±  5%     +11.4%      13750 ±  6%  softirqs.CPU140.RCU
     11828 ±  4%     +13.3%      13401 ±  3%  softirqs.CPU141.RCU
     11823 ±  4%     +14.5%      13536 ±  3%  softirqs.CPU142.RCU
     11658 ±  7%     +15.6%      13472 ±  3%  softirqs.CPU143.RCU
    133947 ± 19%     -23.1%     102992 ±  3%  softirqs.CPU143.TIMER
     12294 ±  3%      +6.5%      13090 ±  3%  softirqs.CPU145.RCU
     12118 ±  3%      +7.3%      12999        softirqs.CPU146.RCU
     12079 ±  3%      +9.6%      13240 ±  2%  softirqs.CPU149.RCU
     11937 ±  3%     +11.0%      13256 ±  2%  softirqs.CPU155.RCU
     12003 ±  3%     +11.2%      13348 ±  4%  softirqs.CPU156.RCU
     11979 ±  6%      +9.1%      13075 ±  4%  softirqs.CPU158.RCU
     11992 ±  3%      +9.6%      13148 ±  4%  softirqs.CPU159.RCU
     12283 ±  5%     +14.0%      13997 ±  9%  softirqs.CPU167.RCU
     11803           +12.4%      13267 ±  3%  softirqs.CPU180.RCU
     12018 ±  5%      +6.8%      12838 ±  4%  softirqs.CPU187.RCU
     12493 ±  5%     +13.6%      14192 ±  4%  softirqs.CPU27.RCU
     12587 ±  6%     +13.8%      14328 ±  6%  softirqs.CPU30.RCU
     12864 ±  3%      +9.6%      14103 ±  4%  softirqs.CPU33.RCU
     12555 ±  4%     +12.9%      14181 ±  6%  softirqs.CPU34.RCU
     12422 ±  4%     +17.1%      14545 ±  4%  softirqs.CPU35.RCU
     12235           +17.1%      14328 ±  2%  softirqs.CPU36.RCU
     12710 ±  5%     +10.1%      13989 ±  3%  softirqs.CPU37.RCU
     12441 ±  4%     +14.6%      14262 ±  3%  softirqs.CPU38.RCU
     12457 ±  4%     +12.4%      14000 ±  3%  softirqs.CPU39.RCU
     12503 ±  5%     +13.5%      14188 ±  2%  softirqs.CPU40.RCU
     12430 ±  4%     +15.9%      14408 ±  3%  softirqs.CPU41.RCU
     12494 ±  5%     +14.5%      14310 ±  3%  softirqs.CPU42.RCU
     12776 ±  4%     +13.9%      14547 ±  5%  softirqs.CPU43.RCU
     12466 ±  4%     +12.6%      14040 ±  2%  softirqs.CPU45.RCU
     12361 ±  5%     +15.8%      14313 ±  3%  softirqs.CPU46.RCU
     11235 ±  8%     +26.8%      14244 ±  2%  softirqs.CPU47.RCU
    118604 ±  8%     -12.5%     103803 ±  2%  softirqs.CPU47.TIMER
     12832 ±  4%      +8.0%      13857 ±  5%  softirqs.CPU62.RCU
     12303 ±  4%      +7.4%      13208 ±  7%  softirqs.CPU73.RCU
     12603 ±  5%     +19.3%      15040 ±  9%  softirqs.CPU8.RCU
     10873 ± 18%     +24.4%      13522 ±  3%  softirqs.CPU81.RCU
     12259            +9.4%      13412 ±  3%  softirqs.CPU83.RCU
     12381 ±  2%      +9.2%      13517 ±  4%  softirqs.CPU87.RCU
     12363 ±  2%      +8.7%      13440 ±  3%  softirqs.CPU88.RCU
     12498 ±  2%      +9.3%      13656 ±  3%  softirqs.CPU89.RCU
     12202 ±  2%      +9.5%      13359 ±  5%  softirqs.CPU92.RCU
     12291 ±  3%     +10.9%      13629 ±  3%  softirqs.CPU94.RCU
     15.61            -9.2        6.40        perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault
     15.89            -9.1        6.81        perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
     40.86            -8.9       31.99        perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault.testcase
     12.32            -8.8        3.57 ±  3%  perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault
     47.45            -8.1       39.30        perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault.testcase
     49.07            -7.6       41.47        perf-profile.calltrace.cycles-pp.do_page_fault.page_fault.testcase
     78.70            -7.5       71.19        perf-profile.calltrace.cycles-pp.testcase
      9.92            -6.2        3.72 ± 16%  perf-profile.calltrace.cycles-pp.page_remove_rmap.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
      6.82 ±  5%      -5.5        1.31 ±  8%  perf-profile.calltrace.cycles-pp.__count_memcg_events.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
      6.45 ±  3%      -5.2        1.23 ± 25%  perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_lruvec_state.page_remove_rmap.unmap_page_range.unmap_vmas
      6.23 ±  4%      -5.2        1.06 ±  3%  perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault
      7.19 ±  2%      -5.0        2.14 ± 18%  perf-profile.calltrace.cycles-pp.__mod_lruvec_state.page_remove_rmap.unmap_page_range.unmap_vmas.unmap_region
      7.00 ±  3%      -5.0        2.04 ±  2%  perf-profile.calltrace.cycles-pp.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault.__handle_mm_fault
     32.31            -4.0       28.28        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
     14.76            -3.8       10.98 ±  9%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
     14.80            -3.8       11.04 ±  9%  perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
     14.83            -3.8       11.07 ±  9%  perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
     14.83            -3.8       11.07 ±  9%  perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
     14.83            -3.8       11.07 ±  9%  perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
     14.83            -3.8       11.08 ±  9%  perf-profile.calltrace.cycles-pp.munmap
     14.82            -3.8       11.07 ±  9%  perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
     14.83            -3.8       11.08 ±  9%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.munmap
     14.83            -3.8       11.08 ±  9%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
      3.96 ±  6%      -3.7        0.27 ±100%  perf-profile.calltrace.cycles-pp.lock_page_memcg.page_add_file_rmap.alloc_set_pte.finish_fault.__handle_mm_fault
      1.72            -0.9        0.81        perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault
      1.30 ±  3%      -0.6        0.69        perf-profile.calltrace.cycles-pp.down_read_trylock.__do_page_fault.do_page_fault.page_fault.testcase
      1.31            -0.6        0.71        perf-profile.calltrace.cycles-pp.up_read.__do_page_fault.do_page_fault.page_fault.testcase
      0.84            -0.2        0.64        perf-profile.calltrace.cycles-pp.unlock_page.fault_dirty_shared_page.__handle_mm_fault.handle_mm_fault.__do_page_fault
      0.55            +0.1        0.67 ±  4%  perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.unmap_page_range.unmap_vmas.unmap_region
      0.53 ±  3%      +0.2        0.72 ±  2%  perf-profile.calltrace.cycles-pp.current_time.file_update_time.__handle_mm_fault.handle_mm_fault.__do_page_fault
      0.70            +0.2        0.90 ±  3%  perf-profile.calltrace.cycles-pp.tlb_flush_mmu.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
      1.91            +0.3        2.18        perf-profile.calltrace.cycles-pp.fault_dirty_shared_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
      0.53 ±  2%      +0.3        0.82 ±  4%  perf-profile.calltrace.cycles-pp.vmacache_find.find_vma.__do_page_fault.do_page_fault.page_fault
      0.92            +0.3        1.24 ±  3%  perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.do_page_fault.page_fault.testcase
      0.60 ±  2%      +0.3        0.93 ±  4%  perf-profile.calltrace.cycles-pp.find_vma.__do_page_fault.do_page_fault.page_fault.testcase
      0.92            +0.3        1.25        perf-profile.calltrace.cycles-pp.file_update_time.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
      0.41 ± 57%      +0.4        0.77 ±  3%  perf-profile.calltrace.cycles-pp.set_page_dirty.fault_dirty_shared_page.__handle_mm_fault.handle_mm_fault.__do_page_fault
      0.96            +0.4        1.34        perf-profile.calltrace.cycles-pp.swapgs_restore_regs_and_return_to_usermode.testcase
      1.37            +0.4        1.80        perf-profile.calltrace.cycles-pp.__perf_sw_event.do_page_fault.page_fault.testcase
      0.84 ±  3%      +0.5        1.30 ±  2%  perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.__do_page_fault.do_page_fault.page_fault
      0.00            +0.6        0.55 ±  2%  perf-profile.calltrace.cycles-pp.set_page_dirty.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
      0.00            +0.6        0.56 ±  4%  perf-profile.calltrace.cycles-pp.__mod_node_page_state.__mod_lruvec_state.page_add_file_rmap.alloc_set_pte.finish_fault
      0.00            +0.6        0.56 ±  2%  perf-profile.calltrace.cycles-pp.page_mapping.set_page_dirty.fault_dirty_shared_page.__handle_mm_fault.handle_mm_fault
      1.42 ±  2%      +0.7        2.12        perf-profile.calltrace.cycles-pp.__perf_sw_event.__do_page_fault.do_page_fault.page_fault.testcase
      1.74            +0.8        2.55 ±  2%  perf-profile.calltrace.cycles-pp.xas_load.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault
      8.78            +2.7       11.48 ±  3%  perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault
      9.54            +2.9       12.42 ±  3%  perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault
     10.25            +3.1       13.40 ±  3%  perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault
     10.57            +3.2       13.80 ±  3%  perf-profile.calltrace.cycles-pp.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
      4.94 ±  2%      +4.4        9.29 ±  4%  perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault
     84.06           +12.0       96.02        perf-profile.calltrace.cycles-pp.page_fault.testcase
     12.70 ±  3%     -10.4        2.31 ± 13%  perf-profile.children.cycles-pp.__mod_memcg_state
     14.21 ±  2%     -10.0        4.23 ± 10%  perf-profile.children.cycles-pp.__mod_lruvec_state
     15.70            -9.2        6.54        perf-profile.children.cycles-pp.alloc_set_pte
     15.93            -9.1        6.86        perf-profile.children.cycles-pp.finish_fault
     40.99            -8.8       32.19        perf-profile.children.cycles-pp.handle_mm_fault
     12.36            -8.8        3.60 ±  3%  perf-profile.children.cycles-pp.page_add_file_rmap
     47.57            -8.1       39.47        perf-profile.children.cycles-pp.__do_page_fault
     49.12            -7.6       41.53        perf-profile.children.cycles-pp.do_page_fault
      9.96            -6.2        3.77 ± 16%  perf-profile.children.cycles-pp.page_remove_rmap
      6.82 ±  5%      -5.5        1.32 ±  8%  perf-profile.children.cycles-pp.__count_memcg_events
      5.18 ±  6%      -4.4        0.77 ± 14%  perf-profile.children.cycles-pp.lock_page_memcg
     32.44            -4.0       28.45        perf-profile.children.cycles-pp.__handle_mm_fault
     14.80            -3.8       11.04 ±  9%  perf-profile.children.cycles-pp.unmap_vmas
     14.80            -3.8       11.04 ±  9%  perf-profile.children.cycles-pp.unmap_page_range
     14.88            -3.8       11.12 ±  9%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     14.88            -3.8       11.12 ±  9%  perf-profile.children.cycles-pp.do_syscall_64
     14.83            -3.8       11.07 ±  9%  perf-profile.children.cycles-pp.__x64_sys_munmap
     14.83            -3.8       11.07 ±  9%  perf-profile.children.cycles-pp.__do_munmap
     14.83            -3.8       11.07 ±  9%  perf-profile.children.cycles-pp.__vm_munmap
     14.82            -3.8       11.07 ±  9%  perf-profile.children.cycles-pp.unmap_region
     14.83            -3.8       11.08 ±  9%  perf-profile.children.cycles-pp.munmap
     10.73 ±  8%      -1.6        9.09        perf-profile.children.cycles-pp.native_irq_return_iret
      1.76            -0.9        0.86        perf-profile.children.cycles-pp._raw_spin_lock
      1.31            -0.6        0.71        perf-profile.children.cycles-pp.up_read
      1.31 ±  3%      -0.6        0.72        perf-profile.children.cycles-pp.down_read_trylock
      0.84            -0.2        0.64        perf-profile.children.cycles-pp.unlock_page
      0.37 ±  5%      -0.2        0.18 ±  4%  perf-profile.children.cycles-pp.__unlock_page_memcg
      0.05 ±  9%      +0.0        0.07        perf-profile.children.cycles-pp.p4d_offset
      0.06 ±  6%      +0.0        0.09 ±  4%  perf-profile.children.cycles-pp.get_page_from_freelist
      0.07            +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.pte_alloc_one
      0.07 ±  6%      +0.0        0.10 ±  5%  perf-profile.children.cycles-pp.__alloc_pages_nodemask
      0.07 ±  5%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
      0.04 ± 57%      +0.0        0.07 ±  6%  perf-profile.children.cycles-pp.prep_new_page
      0.03 ±100%      +0.0        0.07 ±  7%  perf-profile.children.cycles-pp.clear_page_erms
      0.10 ±  4%      +0.0        0.14 ± 10%  perf-profile.children.cycles-pp.task_tick_fair
      0.01 ±173%      +0.0        0.05 ±  9%  perf-profile.children.cycles-pp.unlock_page_memcg
      0.03 ±100%      +0.0        0.07 ± 10%  perf-profile.children.cycles-pp.native_set_pte_at
      0.10            +0.0        0.15 ±  3%  perf-profile.children.cycles-pp.PageHuge
      0.12 ±  6%      +0.0        0.17 ± 14%  perf-profile.children.cycles-pp.scheduler_tick
      0.15 ±  8%      +0.0        0.20 ±  2%  perf-profile.children.cycles-pp._vm_normal_page
      0.13 ±  3%      +0.1        0.18        perf-profile.children.cycles-pp.page_rmapping
      0.14 ±  3%      +0.1        0.19 ±  8%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.16 ±  2%      +0.1        0.21 ±  4%  perf-profile.children.cycles-pp.fpregs_assert_state_consistent
      0.12 ±  4%      +0.1        0.18 ±  2%  perf-profile.children.cycles-pp.rcu_all_qs
      0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.hrtimer_active
      0.17 ±  6%      +0.1        0.22 ± 15%  perf-profile.children.cycles-pp.tick_sched_handle
      0.13 ±  3%      +0.1        0.19 ±  2%  perf-profile.children.cycles-pp.pmd_pfn
      0.16 ±  6%      +0.1        0.22 ± 14%  perf-profile.children.cycles-pp.update_process_times
      0.17 ±  2%      +0.1        0.24        perf-profile.children.cycles-pp.pmd_page_vaddr
      0.21 ±  8%      +0.1        0.27 ±  2%  perf-profile.children.cycles-pp.perf_exclude_event
      0.11 ± 23%      +0.1        0.17 ± 11%  perf-profile.children.cycles-pp.timespec64_trunc
      0.09 ±  4%      +0.1        0.16 ± 32%  perf-profile.children.cycles-pp.mem_cgroup_from_task
      0.22 ±  7%      +0.1        0.30 ± 15%  perf-profile.children.cycles-pp.tick_sched_timer
      0.15 ±  3%      +0.1        0.24        perf-profile.children.cycles-pp.free_pages_and_swap_cache
      0.20 ±  2%      +0.1        0.29 ±  2%  perf-profile.children.cycles-pp.mark_page_accessed
      0.20 ± 14%      +0.1        0.29 ± 15%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
      0.25 ±  4%      +0.1        0.34        perf-profile.children.cycles-pp.__might_sleep
      0.12 ±  4%      +0.1        0.23 ±  2%  perf-profile.children.cycles-pp.__tlb_remove_page_size
      0.27 ±  4%      +0.1        0.37 ± 12%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.23 ±  2%      +0.1        0.34 ±  2%  perf-profile.children.cycles-pp._cond_resched
      0.36 ±  2%      +0.1        0.48 ±  2%  perf-profile.children.cycles-pp.prepare_exit_to_usermode
      0.31 ±  4%      +0.1        0.43 ±  4%  perf-profile.children.cycles-pp.xas_start
      0.56            +0.1        0.69 ±  3%  perf-profile.children.cycles-pp.release_pages
      0.66            +0.1        0.79        perf-profile.children.cycles-pp.___might_sleep
      0.38            +0.1        0.51        perf-profile.children.cycles-pp.__set_page_dirty_no_writeback
      0.69 ±  4%      +0.1        0.83 ± 11%  perf-profile.children.cycles-pp.apic_timer_interrupt
      0.58 ±  4%      +0.2        0.75 ± 12%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.61 ±  4%      +0.2        0.79 ± 11%  perf-profile.children.cycles-pp.smp_apic_timer_interrupt
      0.56 ±  2%      +0.2        0.75 ±  2%  perf-profile.children.cycles-pp.current_time
      0.59 ±  4%      +0.2        0.78 ±  5%  perf-profile.children.cycles-pp.pmd_devmap_trans_unstable
      0.71            +0.2        0.92 ±  3%  perf-profile.children.cycles-pp.tlb_flush_mmu
      1.97            +0.3        2.27        perf-profile.children.cycles-pp.fault_dirty_shared_page
      0.55 ±  2%      +0.3        0.85 ±  4%  perf-profile.children.cycles-pp.vmacache_find
      1.17 ±  2%      +0.3        1.48        perf-profile.children.cycles-pp.page_mapping
      0.94            +0.3        1.28        perf-profile.children.cycles-pp.file_update_time
      0.72 ±  2%      +0.3        1.06 ±  9%  perf-profile.children.cycles-pp.__mod_node_page_state
      0.63 ±  2%      +0.3        0.97 ±  3%  perf-profile.children.cycles-pp.find_vma
      0.96            +0.4        1.34        perf-profile.children.cycles-pp.swapgs_restore_regs_and_return_to_usermode
      0.93 ±  2%      +0.4        1.38 ±  2%  perf-profile.children.cycles-pp.set_page_dirty
      1.89 ±  2%      +0.8        2.68        perf-profile.children.cycles-pp.___perf_sw_event
      1.79            +0.8        2.61        perf-profile.children.cycles-pp.xas_load
      2.81            +1.1        3.94        perf-profile.children.cycles-pp.__perf_sw_event
      4.28            +1.9        6.15        perf-profile.children.cycles-pp.sync_regs
     67.10            +2.4       69.48        perf-profile.children.cycles-pp.page_fault
      8.85            +2.7       11.58 ±  3%  perf-profile.children.cycles-pp.find_lock_entry
      9.56            +2.9       12.45 ±  3%  perf-profile.children.cycles-pp.shmem_getpage_gfp
     10.27            +3.1       13.42 ±  3%  perf-profile.children.cycles-pp.shmem_fault
     10.58            +3.2       13.81 ±  3%  perf-profile.children.cycles-pp.__do_fault
     85.11            +3.8       88.87        perf-profile.children.cycles-pp.testcase
      4.97 ±  2%      +4.4        9.34 ±  4%  perf-profile.children.cycles-pp.find_get_entry
     12.60 ±  3%     -10.3        2.25 ± 14%  perf-profile.self.cycles-pp.__mod_memcg_state
      6.80 ±  5%      -5.5        1.29 ±  8%  perf-profile.self.cycles-pp.__count_memcg_events
      5.11 ±  6%      -4.4        0.71 ± 14%  perf-profile.self.cycles-pp.lock_page_memcg
      2.84 ±  2%      -1.7        1.09        perf-profile.self.cycles-pp.find_lock_entry
     10.73 ±  8%      -1.6        9.08        perf-profile.self.cycles-pp.native_irq_return_iret
      1.73            -0.9        0.83        perf-profile.self.cycles-pp._raw_spin_lock
      1.29 ±  4%      -0.6        0.68        perf-profile.self.cycles-pp.down_read_trylock
      1.30            -0.6        0.70        perf-profile.self.cycles-pp.up_read
      1.37            -0.3        1.05 ±  3%  perf-profile.self.cycles-pp.page_add_file_rmap
      0.82            -0.2        0.61        perf-profile.self.cycles-pp.unlock_page
      0.35 ±  4%      -0.2        0.17 ±  3%  perf-profile.self.cycles-pp.__unlock_page_memcg
      0.05            +0.0        0.07 ±  5%  perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
      0.14 ±  6%      +0.0        0.16 ± 10%  perf-profile.self.cycles-pp.perf_swevent_event
      0.08            +0.0        0.11 ±  3%  perf-profile.self.cycles-pp.PageHuge
      0.11 ±  4%      +0.0        0.14        perf-profile.self.cycles-pp.page_rmapping
      0.09 ±  4%      +0.0        0.13        perf-profile.self.cycles-pp.find_vma
      0.03 ±100%      +0.0        0.07 ±  7%  perf-profile.self.cycles-pp.clear_page_erms
      0.22 ±  6%      +0.0        0.26        perf-profile.self.cycles-pp.__do_fault
      0.09            +0.0        0.13 ±  3%  perf-profile.self.cycles-pp.rcu_all_qs
      0.16 ±  2%      +0.0        0.21 ±  3%  perf-profile.self.cycles-pp.prepare_exit_to_usermode
      0.01 ±173%      +0.0        0.06 ± 14%  perf-profile.self.cycles-pp.native_set_pte_at
      0.13 ±  8%      +0.0        0.18 ±  3%  perf-profile.self.cycles-pp._vm_normal_page
      0.13 ±  5%      +0.0        0.18 ±  9%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.17 ±  9%      +0.0        0.22        perf-profile.self.cycles-pp.perf_exclude_event
      0.11            +0.1        0.16 ±  4%  perf-profile.self.cycles-pp._cond_resched
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.unlock_page_memcg
      0.12 ±  3%      +0.1        0.17 ±  2%  perf-profile.self.cycles-pp.pmd_pfn
      0.00            +0.1        0.05 ±  8%  perf-profile.self.cycles-pp.pmd_devmap
      0.15 ±  5%      +0.1        0.21 ±  4%  perf-profile.self.cycles-pp.fpregs_assert_state_consistent
      0.16 ±  2%      +0.1        0.22        perf-profile.self.cycles-pp.pmd_page_vaddr
      0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.hrtimer_active
      0.10 ± 19%      +0.1        0.16 ±  9%  perf-profile.self.cycles-pp.timespec64_trunc
      0.08 ± 10%      +0.1        0.15 ± 33%  perf-profile.self.cycles-pp.mem_cgroup_from_task
      0.80            +0.1        0.87        perf-profile.self.cycles-pp.__mod_lruvec_state
      0.15 ±  2%      +0.1        0.23 ±  3%  perf-profile.self.cycles-pp.free_pages_and_swap_cache
      0.19 ±  2%      +0.1        0.27 ±  3%  perf-profile.self.cycles-pp.mark_page_accessed
      0.23 ±  5%      +0.1        0.32        perf-profile.self.cycles-pp.__might_sleep
      0.18 ± 15%      +0.1        0.28 ± 16%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      0.20 ±  3%      +0.1        0.29        perf-profile.self.cycles-pp.do_page_fault
      0.23            +0.1        0.33        perf-profile.self.cycles-pp.finish_fault
      0.11 ±  4%      +0.1        0.21 ±  4%  perf-profile.self.cycles-pp.__tlb_remove_page_size
      0.28 ±  4%      +0.1        0.39 ±  3%  perf-profile.self.cycles-pp.xas_start
      0.21 ± 17%      +0.1        0.32 ±  6%  perf-profile.self.cycles-pp.fault_dirty_shared_page
      0.54            +0.1        0.66 ±  4%  perf-profile.self.cycles-pp.release_pages
      0.34            +0.1        0.46        perf-profile.self.cycles-pp.__set_page_dirty_no_writeback
      0.65            +0.1        0.78        perf-profile.self.cycles-pp.___might_sleep
      0.39            +0.1        0.53 ±  2%  perf-profile.self.cycles-pp.file_update_time
      0.29            +0.2        0.44        perf-profile.self.cycles-pp.set_page_dirty
      0.70 ±  6%      +0.2        0.86 ±  3%  perf-profile.self.cycles-pp.shmem_getpage_gfp
      0.56 ±  4%      +0.2        0.74 ±  5%  perf-profile.self.cycles-pp.pmd_devmap_trans_unstable
      0.71 ±  2%      +0.3        0.97        perf-profile.self.cycles-pp.shmem_fault
      0.59            +0.3        0.85        perf-profile.self.cycles-pp.swapgs_restore_regs_and_return_to_usermode
      0.73            +0.3        1.00        perf-profile.self.cycles-pp.page_fault
      1.12 ±  2%      +0.3        1.41        perf-profile.self.cycles-pp.page_mapping
      0.52 ±  2%      +0.3        0.82 ±  4%  perf-profile.self.cycles-pp.vmacache_find
      0.90            +0.3        1.22 ±  2%  perf-profile.self.cycles-pp.__perf_sw_event
      0.71 ±  2%      +0.3        1.05 ±  9%  perf-profile.self.cycles-pp.__mod_node_page_state
      0.91            +0.4        1.29        perf-profile.self.cycles-pp.alloc_set_pte
      1.29            +0.6        1.88        perf-profile.self.cycles-pp.__do_page_fault
      1.57            +0.7        2.23 ±  2%  perf-profile.self.cycles-pp.handle_mm_fault
      1.48            +0.7        2.17 ±  2%  perf-profile.self.cycles-pp.xas_load
      1.56 ±  3%      +0.7        2.26 ±  2%  perf-profile.self.cycles-pp.___perf_sw_event
      2.55 ±  4%      +1.1        3.64        perf-profile.self.cycles-pp.__handle_mm_fault
      2.99            +1.8        4.83 ±  8%  perf-profile.self.cycles-pp.unmap_page_range
      4.27            +1.9        6.13        perf-profile.self.cycles-pp.sync_regs
      3.14 ±  3%      +3.5        6.62 ±  6%  perf-profile.self.cycles-pp.find_get_entry
     18.54           +10.1       28.67        perf-profile.self.cycles-pp.testcase


                                                                                
                            will-it-scale.per_process_ops                       
                                                                                
  700000 +-+---------------------O------------------------------------------+   
         |              O  O  O     O  O   O  O  O  O  O  O     O  O        |   
         O  O  O  O  O                                       O        O     |   
  650000 +-+                                                                |   
         |                                                                  |   
         |                                                                  |   
  600000 +-+                                                                |   
         |                                                                  |   
  550000 +-+                                                                |   
         |                                                                  |   
         |                                                                  |   
  500000 +-+                                                                |   
         |       .+..  .+..      +..     ..+..+..+..+..     .+..     .+..   |   
         |..+..+.    +.    +.. ..   +..+.              +..+.    +..+.      .|   
  450000 +-+----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                will-it-scale.workload                          
                                                                                
  1.35e+08 +-+--------------------------------------------------------------+   
   1.3e+08 +-+            O  O  O  O  O     O O  O  O  O  O     O  O        |   
           O  O  O  O  O                 O                   O        O     |   
  1.25e+08 +-+                                                              |   
   1.2e+08 +-+                                                              |   
           |                                                                |   
  1.15e+08 +-+                                                              |   
   1.1e+08 +-+                                                              |   
  1.05e+08 +-+                                                              |   
           |                                                                |   
     1e+08 +-+                                                              |   
   9.5e+07 +-+                                                              |   
           |..+..  .+..  .+..     .+..     .+.+..+..+..     .+..     .+..  .|   
     9e+07 +-+   +.    +.    +..+.    +..+.            +..+.    +..+.    +. |   
   8.5e+07 +-+--------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.2.0-rc4-00046-g815744d7" of type "text/plain" (196393 bytes)

View attachment "job-script" of type "text/plain" (7331 bytes)

View attachment "job.yaml" of type "text/plain" (4949 bytes)

View attachment "reproduce" of type "text/plain" (316 bytes)
