Message-ID: <20200901065019.GJ4299@shao2-debian>
Date: Tue, 1 Sep 2020 14:50:19 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Joonsoo Kim <iamjoonsoo.kim@....com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>,
Vlastimil Babka <vbabka@...e.cz>,
Hugh Dickins <hughd@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Mel Gorman <mgorman@...hsingularity.net>,
Michal Hocko <mhocko@...nel.org>,
Minchan Kim <minchan@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...el.com
Subject: [mm/workingset] 170b04b7ae: vm-scalability.throughput 11.2% improvement
Greetings,

FYI, we noticed an 11.2% improvement of vm-scalability.throughput due to commit:
commit: 170b04b7ae49634df103810dad67b22cf8a99aa6 ("mm/workingset: prepare the workingset detection infrastructure for anon LRU")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: vm-scalability
on test machine: 104 threads Skylake with 192G memory
with following parameters:
runtime: 300s
size: 1T
test: lru-shm
cpufreq_governor: performance
ucode: 0x2006906
test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
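For reference, the parameters above map onto an lkp-tests job file. A minimal sketch of what such a job file plausibly contains is below; only the keys listed under "with following parameters" are taken from this report, the surrounding layout is an assumption, and the authoritative job.yaml is attached to this email.

cat > job.yaml <<'EOF'
# Hypothetical minimal job file; key/value pairs are the parameters
# reported above, everything else about the format is assumed.
testcase: vm-scalability
runtime: 300s
size: 1T
test: lru-shm
cpufreq_governor: performance
EOF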
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
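For an explicit A/B comparison between the parent and the patch (commit ids as listed below), a hedged sketch, assuming you build and boot each kernel yourself and reuse the attached job.yaml; paths and build targets are illustrative, while the commit ids and lkp commands are the ones from this report:

cd linux
git checkout b518154e59               # parent commit
make -j"$(nproc)" bzImage modules
# install the kernel, reboot into it, then:
(cd ../lkp-tests && bin/lkp run job.yaml)

git checkout 170b04b7ae               # commit under test
make -j"$(nproc)" bzImage modules
# install, reboot, and rerun the same job to compare throughput
(cd ../lkp-tests && bin/lkp run job.yaml)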
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/1T/lkp-skl-fpga01/lru-shm/vm-scalability/0x2006906
commit:
b518154e59 ("mm/vmscan: protect the workingset on anonymous LRU")
170b04b7ae ("mm/workingset: prepare the workingset detection infrastructure for anon LRU")
b518154e59aab3ad (base)        170b04b7ae49634df103810dad6 (patched)
----------------               ---------------------------
   value ± %stddev    %change      value ± %stddev    metric
0.05 ± 5% -26.6% 0.03 ± 3% vm-scalability.free_time
417392 +12.3% 468921 vm-scalability.median
43811736 +11.2% 48709267 vm-scalability.throughput
243.55 -3.7% 234.62 vm-scalability.time.elapsed_time
243.55 -3.7% 234.62 vm-scalability.time.elapsed_time.max
57267 ± 10% -29.6% 40324 vm-scalability.time.involuntary_context_switches
2344 -7.0% 2180 vm-scalability.time.percent_of_cpu_this_job_got
3981 -20.7% 3155 ± 4% vm-scalability.time.system_time
1730 +13.3% 1960 ± 6% vm-scalability.time.user_time
75.25 +2.3% 77.00 vmstat.cpu.id
16.12 -2.8 13.32 ± 4% mpstat.cpu.all.sys%
6.96 +1.2 8.15 ± 6% mpstat.cpu.all.usr%
5511015 ± 5% -12.2% 4837482 ± 8% numa-meminfo.node0.Mapped
11416 ± 2% -13.5% 9871 ± 12% numa-meminfo.node0.PageTables
17772 ± 4% -10.2% 15957 meminfo.Active
17617 ± 4% -10.4% 15790 meminfo.Active(anon)
10864238 -9.0% 9886317 meminfo.Mapped
1386943 ± 4% -14.6% 1185103 ± 12% numa-vmstat.node0.nr_mapped
2842 ± 2% -14.4% 2433 ± 10% numa-vmstat.node0.nr_page_table_pages
17056 ± 27% +346.2% 76106 ± 63% numa-vmstat.node0.numa_other
1424454 ± 6% +19661.4% 2.815e+08 ±142% cpuidle.C1.time
84757 ± 4% +4598.2% 3982093 ±150% cpuidle.C1.usage
130831 ± 2% +230.4% 432228 ± 94% cpuidle.POLL.time
37049 ± 2% +90.7% 70640 ± 34% cpuidle.POLL.usage
1415 ± 7% +19.9% 1696 ± 6% slabinfo.dmaengine-unmap-16.active_objs
1415 ± 7% +19.9% 1696 ± 6% slabinfo.dmaengine-unmap-16.num_objs
3302 ± 4% -15.0% 2808 ± 5% slabinfo.fsnotify_mark_connector.active_objs
3302 ± 4% -15.0% 2808 ± 5% slabinfo.fsnotify_mark_connector.num_objs
4403 ± 5% -10.3% 3949 proc-vmstat.nr_active_anon
12556993 -0.7% 12471053 proc-vmstat.nr_inactive_anon
2697260 -9.9% 2430793 ± 5% proc-vmstat.nr_mapped
5545 -7.6% 5122 ± 5% proc-vmstat.nr_page_table_pages
12490172 -0.7% 12404206 proc-vmstat.nr_shmem
234365 +3.1% 241704 proc-vmstat.nr_unevictable
4403 ± 5% -10.3% 3949 proc-vmstat.nr_zone_active_anon
12556992 -0.7% 12471053 proc-vmstat.nr_zone_inactive_anon
234365 +3.1% 241704 proc-vmstat.nr_zone_unevictable
178.00 ± 4% +2661.2% 4915 ±156% interrupts.41:PCI-MSI.67633156-edge.eth0-TxRx-3
2366 ± 7% -14.0% 2036 ± 3% interrupts.CPU0.NMI:Non-maskable_interrupts
2366 ± 7% -14.0% 2036 ± 3% interrupts.CPU0.PMI:Performance_monitoring_interrupts
1108 ± 41% -51.8% 534.50 ± 5% interrupts.CPU1.CAL:Function_call_interrupts
588.25 ± 16% -16.5% 491.25 ± 3% interrupts.CPU17.CAL:Function_call_interrupts
159.50 ± 51% -37.9% 99.00 ± 9% interrupts.CPU22.RES:Rescheduling_interrupts
118.25 ± 12% -20.7% 93.75 ± 17% interrupts.CPU25.RES:Rescheduling_interrupts
2260 ± 2% -22.1% 1760 ± 24% interrupts.CPU26.NMI:Non-maskable_interrupts
2260 ± 2% -22.1% 1760 ± 24% interrupts.CPU26.PMI:Performance_monitoring_interrupts
178.00 ± 4% +2661.2% 4915 ±156% interrupts.CPU33.41:PCI-MSI.67633156-edge.eth0-TxRx-3
191.75 ± 58% -63.0% 71.00 ± 10% interrupts.CPU36.RES:Rescheduling_interrupts
140.75 ± 30% -41.9% 81.75 ± 27% interrupts.CPU40.RES:Rescheduling_interrupts
224.25 ± 72% -62.1% 85.00 ± 32% interrupts.CPU48.RES:Rescheduling_interrupts
726.75 ± 19% -29.9% 509.50 ± 5% interrupts.CPU52.CAL:Function_call_interrupts
734.25 ± 22% -22.9% 566.00 ± 19% interrupts.CPU53.CAL:Function_call_interrupts
939.00 ± 31% -37.3% 589.00 ± 4% interrupts.CPU54.CAL:Function_call_interrupts
710.25 ± 17% -24.7% 534.50 ± 11% interrupts.CPU56.CAL:Function_call_interrupts
648.75 ± 17% -22.5% 502.50 ± 4% interrupts.CPU61.CAL:Function_call_interrupts
790.75 ± 30% -30.7% 547.75 ± 16% interrupts.CPU65.CAL:Function_call_interrupts
2389 ± 11% -20.4% 1902 interrupts.CPU76.NMI:Non-maskable_interrupts
2389 ± 11% -20.4% 1902 interrupts.CPU76.PMI:Performance_monitoring_interrupts
2959 ± 42% -35.8% 1900 interrupts.CPU83.NMI:Non-maskable_interrupts
2959 ± 42% -35.8% 1900 interrupts.CPU83.PMI:Performance_monitoring_interrupts
879.50 ±137% -87.4% 110.75 ± 36% interrupts.CPU84.RES:Rescheduling_interrupts
2120 ± 17% -41.0% 1252 ± 42% interrupts.CPU85.NMI:Non-maskable_interrupts
2120 ± 17% -41.0% 1252 ± 42% interrupts.CPU85.PMI:Performance_monitoring_interrupts
94.00 ± 14% +42.8% 134.25 ± 20% interrupts.CPU85.RES:Rescheduling_interrupts
172.75 ± 46% -36.9% 109.00 ± 38% interrupts.CPU89.RES:Rescheduling_interrupts
1658 ± 87% -69.2% 510.50 ± 9% interrupts.CPU9.CAL:Function_call_interrupts
23465 ± 19% -45.1% 12876 ± 4% interrupts.RES:Rescheduling_interrupts
1.377e+10 +3.2% 1.422e+10 perf-stat.i.branch-instructions
48655466 +5.7% 51418236 perf-stat.i.cache-misses
6.967e+10 -7.5% 6.442e+10 perf-stat.i.cpu-cycles
1313 -5.1% 1247 perf-stat.i.cycles-between-cache-misses
1.402e+10 +3.1% 1.446e+10 perf-stat.i.dTLB-loads
2255710 +3.8% 2341501 perf-stat.i.dTLB-store-misses
3.814e+09 +3.6% 3.95e+09 perf-stat.i.dTLB-stores
3316652 +62.5% 5387931 ± 3% perf-stat.i.iTLB-load-misses
4.976e+10 +3.1% 5.132e+10 perf-stat.i.instructions
0.67 -7.7% 0.62 perf-stat.i.metric.GHz
305.99 +3.1% 315.37 perf-stat.i.metric.M/sec
2118563 +3.9% 2200504 perf-stat.i.minor-faults
7903787 +4.1% 8229701 perf-stat.i.node-stores
2118563 +3.9% 2200504 perf-stat.i.page-faults
1.40 -10.4% 1.26 perf-stat.overall.cpi
1432 -12.5% 1252 perf-stat.overall.cycles-between-cache-misses
39.88 ± 2% +12.5 52.34 ± 3% perf-stat.overall.iTLB-load-miss-rate%
15005 -36.5% 9534 ± 3% perf-stat.overall.instructions-per-iTLB-miss
0.71 +11.6% 0.80 perf-stat.overall.ipc
1.377e+10 +3.0% 1.419e+10 perf-stat.ps.branch-instructions
48656706 +5.5% 51312384 perf-stat.ps.cache-misses
6.968e+10 -7.7% 6.429e+10 perf-stat.ps.cpu-cycles
1.402e+10 +2.9% 1.442e+10 perf-stat.ps.dTLB-loads
2255925 +3.6% 2336219 perf-stat.ps.dTLB-store-misses
3.811e+09 +3.4% 3.94e+09 perf-stat.ps.dTLB-stores
3316265 +62.1% 5375280 ± 3% perf-stat.ps.iTLB-load-misses
4.975e+10 +2.9% 5.12e+10 perf-stat.ps.instructions
2119230 +3.6% 2195912 perf-stat.ps.minor-faults
7906342 +3.9% 8212206 perf-stat.ps.node-stores
2119230 +3.6% 2195912 perf-stat.ps.page-faults
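As a quick consistency check on the derived metrics above: ipc is instructions per cycle and cpi is its inverse, so the reported values can be re-derived from the raw counters. A one-liner sketch, using the perf-stat.i.* instruction and cycle counts from this report:

# ipc = instructions / cycles; cpi = 1 / ipc
awk 'BEGIN {
    printf "base ipc    = %.2f\n", 4.976e10 / 6.967e10  # ~0.71, matches perf-stat.overall.ipc
    printf "patched ipc = %.2f\n", 5.132e10 / 6.442e10  # ~0.80, matches
}'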
10352 ± 2% +11.6% 11557 ± 6% softirqs.CPU100.RCU
10309 ± 2% +13.9% 11740 ± 10% softirqs.CPU101.RCU
10693 +18.0% 12621 ± 10% softirqs.CPU15.RCU
10974 ± 2% +9.5% 12021 ± 3% softirqs.CPU17.RCU
10715 ± 4% +11.6% 11958 ± 5% softirqs.CPU19.RCU
11394 ± 2% +6.8% 12171 ± 5% softirqs.CPU2.RCU
10500 ± 5% +18.0% 12387 ± 8% softirqs.CPU21.RCU
10393 ± 4% +13.8% 11830 ± 3% softirqs.CPU23.RCU
10377 ± 9% +14.0% 11832 ± 5% softirqs.CPU25.RCU
10270 ± 6% +17.1% 12023 ± 11% softirqs.CPU29.RCU
11134 ± 3% +14.1% 12700 ± 10% softirqs.CPU32.RCU
10837 ± 3% +10.2% 11940 ± 3% softirqs.CPU37.RCU
25056 ± 3% -10.7% 22376 ± 10% softirqs.CPU4.SCHED
10758 ± 2% +13.6% 12226 ± 4% softirqs.CPU41.RCU
10704 ± 2% +14.5% 12257 ± 9% softirqs.CPU45.RCU
10480 ± 6% +8.1% 11327 ± 4% softirqs.CPU47.RCU
10427 ± 3% +8.9% 11359 ± 6% softirqs.CPU48.RCU
10105 ± 4% +25.4% 12673 ± 14% softirqs.CPU49.RCU
10258 ± 4% +9.3% 11210 ± 4% softirqs.CPU52.RCU
11962 ± 18% -16.5% 9990 ± 5% softirqs.CPU53.RCU
10278 +11.4% 11454 ± 3% softirqs.CPU58.RCU
10115 ± 3% +11.6% 11288 ± 4% softirqs.CPU59.RCU
10227 ± 3% +13.7% 11624 ± 5% softirqs.CPU60.RCU
10524 ± 5% +8.5% 11423 ± 4% softirqs.CPU62.RCU
10546 ± 3% +11.8% 11790 ± 4% softirqs.CPU64.RCU
10005 ± 3% +13.7% 11378 ± 4% softirqs.CPU65.RCU
10201 ± 2% +17.3% 11969 ± 4% softirqs.CPU66.RCU
10367 ± 2% +12.3% 11637 ± 2% softirqs.CPU67.RCU
10233 ± 4% +16.5% 11920 ± 3% softirqs.CPU68.RCU
10701 ± 4% +8.6% 11623 ± 3% softirqs.CPU71.RCU
10005 ± 4% +13.7% 11374 ± 2% softirqs.CPU72.RCU
9034 ± 3% +14.8% 10368 softirqs.CPU75.RCU
9290 ± 3% +11.8% 10386 ± 3% softirqs.CPU76.RCU
10645 ± 5% +14.3% 12167 ± 2% softirqs.CPU81.RCU
10657 ± 3% +11.5% 11884 ± 3% softirqs.CPU82.RCU
10582 +35.4% 14324 ± 7% softirqs.CPU83.RCU
10474 ± 4% +24.0% 12983 ± 19% softirqs.CPU86.RCU
10093 ± 2% +16.8% 11788 ± 11% softirqs.CPU92.RCU
10275 +14.4% 11757 ± 8% softirqs.CPU95.RCU
10499 ± 4% +14.8% 12057 ± 9% softirqs.CPU96.RCU
10228 +16.6% 11931 ± 10% softirqs.CPU97.RCU
1120806 ± 2% +9.8% 1230893 ± 3% softirqs.RCU
27146 ± 10% -43.0% 15461 ± 17% sched_debug.cfs_rq:/.exec_clock.avg
35739 ± 9% -33.4% 23815 ± 13% sched_debug.cfs_rq:/.exec_clock.max
24614 ± 11% -43.1% 14012 ± 17% sched_debug.cfs_rq:/.exec_clock.min
2252 ± 14% -28.3% 1613 ± 8% sched_debug.cfs_rq:/.exec_clock.stddev
26221 ± 25% -48.2% 13579 ± 29% sched_debug.cfs_rq:/.load.avg
31.57 ± 21% -38.4% 19.45 ± 14% sched_debug.cfs_rq:/.load_avg.avg
594.11 ± 16% -18.6% 483.79 ± 5% sched_debug.cfs_rq:/.load_avg.max
102.33 ± 16% -29.0% 72.67 ± 11% sched_debug.cfs_rq:/.load_avg.stddev
2729295 ± 10% -43.0% 1556498 ± 17% sched_debug.cfs_rq:/.min_vruntime.avg
2841927 ± 9% -43.0% 1620688 ± 17% sched_debug.cfs_rq:/.min_vruntime.max
2545899 ± 11% -42.6% 1461699 ± 17% sched_debug.cfs_rq:/.min_vruntime.min
75990 ± 36% -56.0% 33447 ± 33% sched_debug.cfs_rq:/.min_vruntime.stddev
0.34 ± 10% -30.0% 0.24 ± 16% sched_debug.cfs_rq:/.nr_running.avg
20.09 ± 12% -43.1% 11.43 ± 19% sched_debug.cfs_rq:/.nr_spread_over.avg
164.81 ± 16% -37.1% 103.71 ± 20% sched_debug.cfs_rq:/.nr_spread_over.max
33.80 ± 15% -39.4% 20.50 ± 13% sched_debug.cfs_rq:/.nr_spread_over.stddev
375.78 ± 9% -24.0% 285.69 ± 12% sched_debug.cfs_rq:/.runnable_avg.avg
-183105 -63.3% -67248 sched_debug.cfs_rq:/.spread0.min
76034 ± 37% -56.0% 33448 ± 33% sched_debug.cfs_rq:/.spread0.stddev
369.43 ± 9% -23.5% 282.61 ± 11% sched_debug.cfs_rq:/.util_avg.avg
696.88 ± 4% -20.5% 553.75 ± 9% sched_debug.cfs_rq:/.util_est_enqueued.max
93.11 ± 4% -17.5% 76.85 ± 9% sched_debug.cfs_rq:/.util_est_enqueued.stddev
131720 ± 7% +21.2% 159588 ± 5% sched_debug.cpu.avg_idle.stddev
153356 ± 8% -29.2% 108595 ± 12% sched_debug.cpu.clock.avg
153364 ± 8% -29.2% 108602 ± 12% sched_debug.cpu.clock.max
153348 ± 8% -29.2% 108588 ± 12% sched_debug.cpu.clock.min
151791 ± 8% -29.1% 107548 ± 12% sched_debug.cpu.clock_task.avg
152331 ± 8% -29.2% 107864 ± 12% sched_debug.cpu.clock_task.max
146177 ± 8% -30.1% 102228 ± 12% sched_debug.cpu.clock_task.min
10821 ± 8% -28.9% 7692 ± 12% sched_debug.cpu.curr->pid.max
0.32 ± 12% -27.4% 0.23 ± 19% sched_debug.cpu.nr_running.avg
5319 ± 9% -29.2% 3764 ± 11% sched_debug.cpu.nr_switches.avg
2044 ± 8% -27.5% 1481 ± 12% sched_debug.cpu.nr_switches.min
30.00 ± 28% +60.0% 48.00 ± 15% sched_debug.cpu.nr_uninterruptible.max
6.75 ± 14% +21.5% 8.19 ± 5% sched_debug.cpu.nr_uninterruptible.stddev
3776 ± 12% -41.9% 2194 ± 20% sched_debug.cpu.sched_count.avg
1236 ± 9% -41.2% 727.58 ± 17% sched_debug.cpu.sched_count.min
1477 ± 10% -36.7% 935.58 ± 20% sched_debug.cpu.sched_goidle.avg
386.73 ± 9% -37.5% 241.60 ± 22% sched_debug.cpu.sched_goidle.min
1705 ± 12% -42.8% 975.64 ± 21% sched_debug.cpu.ttwu_count.avg
24239 ± 22% -31.5% 16612 ± 15% sched_debug.cpu.ttwu_count.max
522.40 ± 13% -41.1% 307.79 ± 21% sched_debug.cpu.ttwu_count.min
2783 ± 15% -33.1% 1862 ± 12% sched_debug.cpu.ttwu_count.stddev
682.47 ± 15% -48.1% 354.52 ± 18% sched_debug.cpu.ttwu_local.avg
5220 ± 16% -59.3% 2122 ± 18% sched_debug.cpu.ttwu_local.max
282.79 ± 13% -39.4% 171.29 ± 21% sched_debug.cpu.ttwu_local.min
709.90 ± 15% -57.2% 304.11 ± 13% sched_debug.cpu.ttwu_local.stddev
153349 ± 8% -29.2% 108590 ± 12% sched_debug.cpu_clk
152855 ± 8% -29.3% 108094 ± 12% sched_debug.ktime
154041 ± 8% -29.1% 109253 ± 12% sched_debug.sched_clk
25.29 ±119% -25.3 0.00 perf-profile.calltrace.cycles-pp.asm_exc_page_fault
24.30 ±119% -24.3 0.00 perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault
24.14 ±119% -24.1 0.00 perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
23.72 ±119% -23.7 0.00 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
26.28 ± 10% -14.8 11.48 ± 3% perf-profile.calltrace.cycles-pp.lru_cache_add.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
26.12 ± 10% -14.8 11.32 ± 3% perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.lru_cache_add.shmem_getpage_gfp.shmem_fault.__do_fault
23.96 ± 10% -14.5 9.45 ± 4% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.shmem_getpage_gfp
24.02 ± 10% -14.5 9.51 ± 4% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_cache_add.shmem_getpage_gfp.shmem_fault
56.91 ± 12% -11.9 45.00 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
56.41 ± 12% -11.9 44.51 perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
48.60 ± 12% -11.6 37.03 perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
48.89 ± 12% -11.6 37.32 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
48.96 ± 12% -11.6 37.39 perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
2.75 ± 24% -2.5 0.30 ±101% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlinkat
2.75 ± 24% -2.5 0.30 ±101% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlinkat
2.75 ± 24% -2.5 0.30 ±101% perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlinkat
2.75 ± 24% -2.5 0.30 ±101% perf-profile.calltrace.cycles-pp.evict.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlinkat
2.75 ± 24% -2.5 0.30 ±101% perf-profile.calltrace.cycles-pp.unlinkat
1.82 ± 18% -0.6 1.26 ± 2% perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.82 ± 18% -0.6 1.26 ± 2% perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
1.80 ± 18% -0.6 1.24 ± 2% perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
1.81 ± 18% -0.6 1.25 ± 2% perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
1.81 ± 18% -0.6 1.25 ± 2% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
2.20 ± 11% -0.5 1.73 perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
2.15 ± 11% -0.5 1.67 perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
1.81 ± 11% -0.5 1.33 perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault
1.35 ± 16% -0.3 1.02 ± 10% perf-profile.calltrace.cycles-pp.ret_from_fork
1.35 ± 16% -0.3 1.02 ± 10% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
1.33 ± 16% -0.3 1.00 ± 10% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
1.32 ± 16% -0.3 0.99 ± 10% perf-profile.calltrace.cycles-pp.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread.ret_from_fork
1.33 ± 17% -0.3 1.00 ± 10% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
1.28 ± 16% -0.3 0.96 ± 10% perf-profile.calltrace.cycles-pp.memcpy_erms.drm_fb_helper_dirty_work.process_one_work.worker_thread.kthread
1.68 ± 7% -0.2 1.48 perf-profile.calltrace.cycles-pp.__pagevec_lru_add_fn.pagevec_lru_move_fn.lru_cache_add.shmem_getpage_gfp.shmem_fault
0.62 ± 60% +0.4 1.05 ± 3% perf-profile.calltrace.cycles-pp.__irqentry_text_end.do_access
1.23 ± 61% +1.2 2.40 perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_exc_page_fault.do_access
2.24 ± 21% +1.2 3.48 ± 5% perf-profile.calltrace.cycles-pp.get_mem_cgroup_from_mm.mem_cgroup_charge.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault
0.87 ±114% +1.6 2.47 ± 3% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.88 ±114% +1.6 2.47 ± 3% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
9.19 ± 17% +2.4 11.55 ± 3% perf-profile.calltrace.cycles-pp.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
0.00 +2.4 2.38 ± 3% perf-profile.calltrace.cycles-pp.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +2.4 2.38 ± 3% perf-profile.calltrace.cycles-pp.evict.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.64 ± 21% +2.8 8.47 ± 4% perf-profile.calltrace.cycles-pp.mem_cgroup_charge.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault.__do_fault
13.37 ± 61% +9.1 22.42 perf-profile.calltrace.cycles-pp.do_rw_once
26.17 ± 10% -14.8 11.35 ± 3% perf-profile.children.cycles-pp.pagevec_lru_move_fn
26.28 ± 10% -14.8 11.48 ± 3% perf-profile.children.cycles-pp.lru_cache_add
24.20 ± 10% -14.5 9.67 ± 4% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
24.12 ± 10% -14.5 9.62 ± 4% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
56.45 ± 12% -11.9 44.54 perf-profile.children.cycles-pp.do_fault
56.95 ± 12% -11.9 45.05 perf-profile.children.cycles-pp.__handle_mm_fault
59.12 ± 12% -11.8 47.29 perf-profile.children.cycles-pp.do_user_addr_fault
58.05 ± 12% -11.8 46.22 perf-profile.children.cycles-pp.handle_mm_fault
59.51 ± 12% -11.8 47.73 perf-profile.children.cycles-pp.exc_page_fault
48.62 ± 12% -11.6 37.05 perf-profile.children.cycles-pp.shmem_getpage_gfp
48.89 ± 12% -11.6 37.32 perf-profile.children.cycles-pp.shmem_fault
48.96 ± 12% -11.6 37.40 perf-profile.children.cycles-pp.__do_fault
63.35 ± 10% -10.5 52.82 perf-profile.children.cycles-pp.asm_exc_page_fault
2.75 ± 24% -2.2 0.51 ± 19% perf-profile.children.cycles-pp.unlinkat
1.83 ± 18% -0.6 1.27 ± 2% perf-profile.children.cycles-pp.unmap_region
1.84 ± 18% -0.6 1.27 ± 2% perf-profile.children.cycles-pp.__do_munmap
1.83 ± 18% -0.6 1.26 ± 2% perf-profile.children.cycles-pp.__vm_munmap
1.82 ± 18% -0.6 1.26 ± 2% perf-profile.children.cycles-pp.__x64_sys_munmap
1.82 ± 18% -0.6 1.26 ± 2% perf-profile.children.cycles-pp.zap_pte_range
1.82 ± 18% -0.6 1.27 ± 2% perf-profile.children.cycles-pp.unmap_page_range
1.82 ± 18% -0.6 1.27 ± 2% perf-profile.children.cycles-pp.unmap_vmas
1.16 ± 17% -0.5 0.66 ± 4% perf-profile.children.cycles-pp.page_remove_rmap
2.21 ± 11% -0.5 1.73 perf-profile.children.cycles-pp.finish_fault
1.82 ± 11% -0.5 1.34 perf-profile.children.cycles-pp.page_add_file_rmap
2.56 ± 11% -0.5 2.09 perf-profile.children.cycles-pp.alloc_set_pte
1.35 ± 16% -0.3 1.02 ± 10% perf-profile.children.cycles-pp.ret_from_fork
1.35 ± 16% -0.3 1.02 ± 10% perf-profile.children.cycles-pp.kthread
1.33 ± 16% -0.3 1.00 ± 10% perf-profile.children.cycles-pp.worker_thread
1.32 ± 16% -0.3 0.99 ± 10% perf-profile.children.cycles-pp.drm_fb_helper_dirty_work
1.33 ± 17% -0.3 1.00 ± 10% perf-profile.children.cycles-pp.process_one_work
1.32 ± 16% -0.3 0.99 ± 10% perf-profile.children.cycles-pp.memcpy_erms
1.69 ± 7% -0.2 1.50 perf-profile.children.cycles-pp.__pagevec_lru_add_fn
0.08 ± 11% -0.0 0.06 ± 9% perf-profile.children.cycles-pp.mem_cgroup_page_lruvec
0.27 ± 12% +0.1 0.32 ± 6% perf-profile.children.cycles-pp.xas_create_range
0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.get_vma_policy
0.00 +0.2 0.23 ± 6% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
2.25 ± 21% +1.2 3.49 ± 5% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
9.22 ± 17% +2.4 11.59 ± 3% perf-profile.children.cycles-pp.shmem_add_to_page_cache
5.66 ± 21% +2.8 8.49 ± 4% perf-profile.children.cycles-pp.mem_cgroup_charge
12.88 ± 61% +9.7 22.58 perf-profile.children.cycles-pp.do_rw_once
24.12 ± 10% -14.5 9.62 ± 4% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.36 ± 15% -0.6 0.80 perf-profile.self.cycles-pp.shmem_add_to_page_cache
0.65 ± 12% -0.4 0.26 ± 9% perf-profile.self.cycles-pp.page_add_file_rmap
1.31 ± 16% -0.3 0.98 ± 10% perf-profile.self.cycles-pp.memcpy_erms
0.48 ± 17% -0.3 0.17 ± 2% perf-profile.self.cycles-pp.page_remove_rmap
1.06 ± 11% -0.3 0.79 perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
0.80 ± 7% -0.2 0.57 perf-profile.self.cycles-pp.__pagevec_lru_add_fn
0.07 ± 12% -0.0 0.04 ± 57% perf-profile.self.cycles-pp.mem_cgroup_page_lruvec
0.08 ± 15% -0.0 0.06 ± 6% perf-profile.self.cycles-pp.truncate_cleanup_page
0.08 ± 15% +0.0 0.12 ± 5% perf-profile.self.cycles-pp.xas_create_range
0.15 ± 19% +0.0 0.19 ± 4% perf-profile.self.cycles-pp.xas_find_conflict
0.00 +0.1 0.14 ± 3% perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
2.23 ± 21% +1.2 3.46 ± 5% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
2.15 ± 25% +1.5 3.66 ± 4% perf-profile.self.cycles-pp.mem_cgroup_charge
6.88 ± 61% +4.0 10.87 perf-profile.self.cycles-pp.do_access
10.47 ± 61% +8.7 19.20 perf-profile.self.cycles-pp.do_rw_once
vm-scalability.throughput
  [ASCII run-chart: vm-scalability.throughput per run; bisect-bad (O)
   samples cluster around 4.7e+07-5.0e+07, while bisect-good (*) samples
   fall around 4.1e+07-4.6e+07]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
Attachments:
  config-5.8.0-12308-g170b04b7ae496 (text/plain, 170150 bytes)
  job-script (text/plain, 7566 bytes)
  job.yaml (text/plain, 5290 bytes)
  reproduce (text/plain, 345948 bytes)