Message-ID: <20210104074651.GD4811@xsang-OptiPlex-9020>
Date: Mon, 4 Jan 2021 15:46:51 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Feng Tang <feng.tang@...el.com>
Cc: 0day robot <lkp@...el.com>, Shakeel Butt <shakeelb@...gle.com>,
Roman Gushchin <guro@...com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...e.com>,
Johannes Weiner <hannes@...xchg.org>,
Vladimir Davydov <vdavydov.dev@...il.com>, linux-mm@...ck.org,
andi.kleen@...el.com, tim.c.chen@...el.com, dave.hansen@...el.com
Subject: [mm] 4d8191276e: vm-scalability.throughput 43.4% improvement
Greetings,
FYI, we noticed a 43.4% improvement in vm-scalability.throughput due to the following commit:
commit: 4d8191276e029a0ea7ef58f329006972551dbe29 ("[PATCH 2/2] mm: memcg: add a new MEMCG_UPDATE_BATCH")
url: https://github.com/0day-ci/linux/commits/Feng-Tang/mm-page_counter-relayout-structure-to-reduce-false-sharing/20201229-223627
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git dea8dcf2a9fa8cc540136a6cd885c3beece16ec3
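For context: memcg accounting accumulates per-CPU deltas and only folds them into the shared atomic counters once a batch threshold is exceeded, so a larger update batch means fewer contended atomic writes on a many-core machine. The userspace model below sketches that batching idea only; it is not the patch under test, and NR_CPUS, UPDATE_BATCH, and the loop counts are illustrative assumptions (only the value 32 for CHARGE_BATCH matches the kernel's MEMCG_CHARGE_BATCH).
/*
 * Userspace model of per-CPU batched counter updates.  A sketch of the
 * batching idea only, NOT the MEMCG_UPDATE_BATCH patch itself: the batch
 * values and all names below are illustrative assumptions.
 */
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define NR_CPUS       4
#define CHARGE_BATCH  32    /* same value as the kernel's MEMCG_CHARGE_BATCH */
#define UPDATE_BATCH  128   /* hypothetical larger batch for stat updates */

static atomic_long shared_stat;        /* stand-in for a shared memcg counter */

/* Accumulate per CPU; only touch the shared atomic when a batch overflows. */
static unsigned long run(long batch)
{
    long percpu[NR_CPUS] = { 0 };
    unsigned long flushes = 0;
    long i;

    for (i = 0; i < 1000000; i++) {    /* one update per "page" */
        int cpu = (int)(i % NR_CPUS);
        long x = percpu[cpu] + 1;

        if (labs(x) > batch) {
            atomic_fetch_add(&shared_stat, x);
            flushes++;
            x = 0;
        }
        percpu[cpu] = x;
    }
    return flushes;
}

int main(void)
{
    printf("flushes with batch %3d: %lu\n", CHARGE_BATCH, run(CHARGE_BATCH));
    printf("flushes with batch %3d: %lu\n", UPDATE_BATCH, run(UPDATE_BATCH));
    return 0;
}
With the larger batch the program reports roughly a quarter of the flushes, which is the kind of reduction the __mod_memcg_state and __mod_memcg_lruvec_state entries in the profile data below reflect.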
in testcase: vm-scalability
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:
runtime: 300s
size: 1T
test: lru-shm
cpufreq_governor: performance
ucode: 0x5003003
test-description: The motivation behind this suite is to exercise functions and regions of the mm/ subsystem of the Linux kernel that are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
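The lru-shm case stresses the shmem fault and memcg charge path (shmem_fault, shmem_getpage_gfp, mem_cgroup_charge in the profiles below). A minimal userspace sketch of that kind of workload, assuming a plain anonymous shared mapping touched one byte per page, is shown here; it is not the vm-scalability source, and the 1 GiB size is an arbitrary stand-in for the job's 1T total.
/*
 * Minimal sketch of an lru-shm style workload: fault in anonymous shared
 * (shmem-backed) memory and touch every page.  Illustrative only; the
 * real test case lives in the vm-scalability suite linked above.
 */
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len  = 1024UL * 1024 * 1024;   /* 1 GiB here; the job sizes 1T in total */
    size_t off;
    char *buf;

    /* Anonymous MAP_SHARED memory is backed by shmem, so every first
     * touch goes through the shmem fault path. */
    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* One byte per page: each write is a minor fault that allocates and
     * memcg-charges a new shmem page. */
    for (off = 0; off < len; off += page)
        buf[off] = 1;

    munmap(buf, len);
    return 0;
}
Anonymous MAP_SHARED memory is shmem-backed, so each first touch faults in a page and charges it to the memcg, which is where the batching change is exercised.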
Details are as follows:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/1T/lkp-csl-2ap4/lru-shm/vm-scalability/0x5003003
commit:
f13e623fa8 ("mm: page_counter: relayout structure to reduce false sharing")
4d8191276e ("mm: memcg: add a new MEMCG_UPDATE_BATCH")
f13e623fa86ab7de 4d8191276e029a0ea7ef58f3290
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
0:4 58% 2:4 perf-profile.calltrace.cycles-pp.sync_regs.error_entry.do_access
0:4 69% 3:4 perf-profile.calltrace.cycles-pp.error_entry.do_access
3:4 17% 3:4 perf-profile.children.cycles-pp.error_entry
0:4 3% 0:4 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
0.02 ± 2% -38.1% 0.01 vm-scalability.free_time
318058 ± 4% +41.6% 450322 vm-scalability.median
1.29 ± 20% +0.6 1.87 ± 18% vm-scalability.median_stddev%
60698434 ± 4% +43.4% 87053351 vm-scalability.throughput
55128 ± 6% -25.3% 41169 ± 2% vm-scalability.time.involuntary_context_switches
7.077e+08 +8.8% 7.698e+08 vm-scalability.time.minor_page_faults
3381 ± 4% -24.5% 2551 vm-scalability.time.percent_of_cpu_this_job_got
8274 ± 4% -32.2% 5609 vm-scalability.time.system_time
2079 ± 3% +8.7% 2260 ± 3% vm-scalability.time.user_time
67274 +7.1% 72042 vm-scalability.time.voluntary_context_switches
3.17e+09 +8.8% 3.448e+09 vm-scalability.workload
14.50 ± 5% -4.6 9.88 mpstat.cpu.all.sys%
79.75 +5.3% 84.00 vmstat.cpu.id
35.50 ± 4% -23.2% 27.25 vmstat.procs.r
8528 +2.1% 8707 vmstat.system.cs
92369 ± 4% -16.8% 76889 ± 2% meminfo.Active
91324 ± 4% -16.9% 75846 ± 2% meminfo.Active(anon)
8538039 ± 5% -23.3% 6548365 ± 3% meminfo.Mapped
20654 ± 4% -9.7% 18643 meminfo.PageTables
1.759e+08 ± 3% +9.7% 1.929e+08 numa-numastat.node1.local_node
1.759e+08 ± 3% +9.7% 1.93e+08 numa-numastat.node1.numa_hit
1.76e+08 +11.7% 1.967e+08 numa-numastat.node3.local_node
1.761e+08 +11.7% 1.967e+08 numa-numastat.node3.numa_hit
0.83 ± 20% -70.0% 0.25 ±137% sched_debug.cfs_rq:/.load_avg.min
4209545 ± 11% -23.4% 3224267 ± 10% sched_debug.cfs_rq:/.min_vruntime.avg
4780801 ± 12% -24.9% 3589252 ± 11% sched_debug.cfs_rq:/.min_vruntime.max
3665900 ± 10% -21.0% 2895741 ± 10% sched_debug.cfs_rq:/.min_vruntime.min
801.36 ± 11% +18.7% 950.87 ± 10% sched_debug.cfs_rq:/.util_est_enqueued.max
-22.25 -31.6% -15.21 sched_debug.cpu.nr_uninterruptible.min
262824 ± 42% +68.3% 442338 ± 20% numa-meminfo.node0.AnonPages.max
2362824 ± 15% -30.3% 1645765 ± 11% numa-meminfo.node0.Mapped
1983339 ± 7% -24.3% 1501294 ± 4% numa-meminfo.node1.Mapped
4331 ± 22% -25.1% 3244 ± 2% numa-meminfo.node1.PageTables
2047399 ± 4% -19.5% 1648724 ± 11% numa-meminfo.node2.Mapped
80277 ± 6% -17.8% 65959 ± 6% numa-meminfo.node3.Active
80217 ± 6% -18.3% 65547 ± 5% numa-meminfo.node3.Active(anon)
7105 ± 4% +13.5% 8061 ± 12% numa-meminfo.node3.KernelStack
2001243 ± 4% -19.7% 1607335 ± 6% numa-meminfo.node3.Mapped
22819 ± 4% -17.0% 18944 ± 3% proc-vmstat.nr_active_anon
2132359 ± 5% -23.4% 1632702 proc-vmstat.nr_mapped
5216 ± 5% -10.9% 4645 proc-vmstat.nr_page_table_pages
22819 ± 4% -17.0% 18944 ± 3% proc-vmstat.nr_zone_active_anon
7.104e+08 +8.8% 7.725e+08 proc-vmstat.numa_hit
7.101e+08 +8.8% 7.723e+08 proc-vmstat.numa_local
54433 ± 6% -19.1% 44024 proc-vmstat.pgactivate
7.114e+08 +8.7% 7.736e+08 proc-vmstat.pgalloc_normal
7.09e+08 +8.8% 7.71e+08 proc-vmstat.pgfault
7.114e+08 +8.7% 7.735e+08 proc-vmstat.pgfree
268108 ± 2% +8.6% 291111 proc-vmstat.pgreuse
584988 ± 17% -29.6% 412076 ± 13% numa-vmstat.node0.nr_mapped
497690 ± 8% -23.6% 380252 numa-vmstat.node1.nr_mapped
1081 ± 23% -25.4% 806.50 ± 6% numa-vmstat.node1.nr_page_table_pages
90657677 ± 3% +9.5% 99278002 numa-vmstat.node1.numa_hit
90503652 ± 3% +9.5% 99112443 numa-vmstat.node1.numa_local
507882 ± 4% -20.7% 402874 ± 12% numa-vmstat.node2.nr_mapped
20043 ± 6% -18.3% 16379 ± 5% numa-vmstat.node3.nr_active_anon
7112 ± 5% +13.2% 8050 ± 12% numa-vmstat.node3.nr_kernel_stack
501812 ± 4% -20.6% 398460 ± 5% numa-vmstat.node3.nr_mapped
20043 ± 6% -18.3% 16379 ± 5% numa-vmstat.node3.nr_zone_active_anon
90627456 +11.8% 1.013e+08 numa-vmstat.node3.numa_hit
90496033 +11.9% 1.012e+08 numa-vmstat.node3.numa_local
131566 ± 16% -27.3% 95685 ± 13% numa-vmstat.node3.numa_other
33495 ± 2% +8.5% 36327 ± 2% softirqs.CPU122.SCHED
33428 ± 2% +10.9% 37058 ± 2% softirqs.CPU123.SCHED
32980 ± 3% +10.1% 36319 softirqs.CPU127.SCHED
33546 ± 2% +8.7% 36459 ± 2% softirqs.CPU132.SCHED
33179 ± 5% +10.0% 36510 softirqs.CPU135.SCHED
31601 ± 8% +15.5% 36501 softirqs.CPU136.SCHED
33601 ± 2% +8.4% 36421 ± 2% softirqs.CPU138.SCHED
32180 ± 2% +13.0% 36361 softirqs.CPU142.SCHED
32887 ± 4% +12.0% 36840 ± 2% softirqs.CPU143.SCHED
33194 ± 3% +7.2% 35597 ± 2% softirqs.CPU162.SCHED
33042 ± 4% +10.6% 36534 softirqs.CPU27.SCHED
32749 ± 2% +11.6% 36545 softirqs.CPU36.SCHED
33101 ± 2% +10.3% 36497 softirqs.CPU42.SCHED
32471 ± 2% +11.1% 36075 softirqs.CPU44.SCHED
32250 ± 3% +10.3% 35566 ± 2% softirqs.CPU46.SCHED
32214 ± 6% +12.4% 36197 softirqs.CPU47.SCHED
7658 ± 13% +30.4% 9987 ± 21% softirqs.CPU67.RCU
7589 ± 13% +27.7% 9692 ± 12% softirqs.CPU69.RCU
0.05 ±136% -86.7% 0.01 ± 39% perf-sched.sch_delay.max.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
0.00 ±101% +222.2% 0.01 ± 39% perf-sched.sch_delay.max.ms.io_schedule.__lock_page_killable.filemap_fault.__do_fault
5.68 ± 60% -98.3% 0.10 ± 45% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
0.24 ± 73% +8.4e+05% 2049 ±173% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select
5030 ± 14% +52.1% 7652 ± 12% perf-sched.total_wait_and_delay.max.ms
5029 ± 14% +50.6% 7576 ± 11% perf-sched.total_wait_time.max.ms
5.51 ± 18% -50.9% 2.71 ± 49% perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
0.98 ± 18% -46.1% 0.53 ± 48% perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
4.82 ± 5% +14.3% 5.51 ± 8% perf-sched.wait_and_delay.avg.ms.preempt_schedule_common._cond_resched.stop_one_cpu.affine_move_task.__set_cpus_allowed_ptr
214.13 ± 12% +46.1% 312.79 ± 8% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
1146 ± 4% -44.9% 632.00 ± 12% perf-sched.wait_and_delay.count.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
3107 ± 7% -37.9% 1930 ± 8% perf-sched.wait_and_delay.count.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
67.75 ± 10% -29.9% 47.50 ± 3% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
1185 ± 26% +70.3% 2019 ± 21% perf-sched.wait_and_delay.max.ms.preempt_schedule_common._cond_resched.stop_one_cpu.affine_move_task.__set_cpus_allowed_ptr
2862 ± 23% +48.4% 4248 ± 28% perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
4513 ± 10% +54.1% 6953 ± 9% perf-sched.wait_and_delay.max.ms.worker_thread.kthread.ret_from_fork
5.50 ± 18% -51.0% 2.70 ± 49% perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
0.96 ± 17% -47.3% 0.51 ± 50% perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
4.82 ± 5% +14.3% 5.51 ± 8% perf-sched.wait_time.avg.ms.preempt_schedule_common._cond_resched.stop_one_cpu.affine_move_task.__set_cpus_allowed_ptr
0.06 ± 21% +182.5% 0.16 ± 73% perf-sched.wait_time.avg.ms.preempt_schedule_common._cond_resched.stop_one_cpu.sched_exec.bprm_execve
213.87 ± 12% +46.1% 312.55 ± 8% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
1.50 ± 58% +2098.6% 33.07 ±139% perf-sched.wait_time.avg.ms.schedule_timeout.__skb_wait_for_more_packets.unix_dgram_recvmsg.__sys_recvfrom
8.74 ± 3% -24.6% 6.59 ± 6% perf-sched.wait_time.avg.ms.sigsuspend.__x64_sys_rt_sigsuspend.do_syscall_64.entry_SYSCALL_64_after_hwframe
2303 ± 6% -19.1% 1864 ± 15% perf-sched.wait_time.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
16.35 ± 27% -20.4% 13.01 perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
1185 ± 26% +70.3% 2019 ± 21% perf-sched.wait_time.max.ms.preempt_schedule_common._cond_resched.stop_one_cpu.affine_move_task.__set_cpus_allowed_ptr
2862 ± 23% +48.4% 4248 ± 28% perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_sys_poll
1.50 ± 58% +2098.6% 33.07 ±139% perf-sched.wait_time.max.ms.schedule_timeout.__skb_wait_for_more_packets.unix_dgram_recvmsg.__sys_recvfrom
533.83 ± 18% -56.5% 232.10 ± 37% perf-sched.wait_time.max.ms.sigsuspend.__x64_sys_rt_sigsuspend.do_syscall_64.entry_SYSCALL_64_after_hwframe
4512 ± 10% +54.1% 6953 ± 9% perf-sched.wait_time.max.ms.worker_thread.kthread.ret_from_fork
1.473e+10 +8.3% 1.596e+10 perf-stat.i.branch-instructions
67391842 -13.6% 58252334 ± 2% perf-stat.i.cache-misses
1.82 -10.5% 1.63 ± 2% perf-stat.i.cpi
1.141e+11 ± 3% -20.9% 9.022e+10 perf-stat.i.cpu-cycles
1585 ± 2% -4.5% 1514 ± 3% perf-stat.i.cycles-between-cache-misses
1.505e+10 +8.1% 1.626e+10 perf-stat.i.dTLB-loads
4.219e+09 +7.7% 4.544e+09 perf-stat.i.dTLB-stores
5.387e+10 +8.1% 5.822e+10 perf-stat.i.instructions
0.56 +11.1% 0.63 ± 2% perf-stat.i.ipc
0.59 ± 3% -21.4% 0.47 perf-stat.i.metric.GHz
178.65 +7.4% 191.79 perf-stat.i.metric.M/sec
2221317 +9.1% 2423232 perf-stat.i.minor-faults
5686701 ± 3% -20.6% 4517487 ± 6% perf-stat.i.node-load-misses
2717590 ± 5% -44.2% 1517657 ± 3% perf-stat.i.node-store-misses
8459610 ± 2% +10.0% 9304231 perf-stat.i.node-stores
2221319 +9.1% 2423234 perf-stat.i.page-faults
2.13 ± 4% -27.1% 1.55 perf-stat.overall.cpi
1693 ± 3% -8.6% 1547 ± 2% perf-stat.overall.cycles-between-cache-misses
6488 +8.4% 7033 ± 4% perf-stat.overall.instructions-per-iTLB-miss
0.47 ± 4% +36.9% 0.65 perf-stat.overall.ipc
82.36 -3.7 78.65 perf-stat.overall.node-load-miss-rate%
24.27 ± 5% -10.3 13.95 ± 3% perf-stat.overall.node-store-miss-rate%
1.522e+10 +7.4% 1.634e+10 perf-stat.ps.branch-instructions
69718706 -14.4% 59676985 ± 2% perf-stat.ps.cache-misses
1.181e+11 ± 3% -21.8% 9.228e+10 perf-stat.ps.cpu-cycles
1.553e+10 +7.1% 1.663e+10 perf-stat.ps.dTLB-loads
3543187 ± 4% +8.0% 3825044 ± 4% perf-stat.ps.dTLB-store-misses
4.332e+09 +6.9% 4.633e+09 perf-stat.ps.dTLB-stores
5.554e+10 +7.2% 5.953e+10 perf-stat.ps.instructions
1.46 +18.1% 1.72 ± 9% perf-stat.ps.major-faults
2307245 +7.9% 2489992 perf-stat.ps.minor-faults
5847320 ± 3% -21.4% 4594154 ± 6% perf-stat.ps.node-load-misses
2817678 ± 5% -45.0% 1550607 ± 3% perf-stat.ps.node-store-misses
8791418 +8.8% 9563343 perf-stat.ps.node-stores
2307247 +7.9% 2489994 perf-stat.ps.page-faults
1.706e+13 +8.0% 1.843e+13 perf-stat.total.instructions
1903 ± 39% +114.3% 4079 ± 53% interrupts.CPU1.CAL:Function_call_interrupts
1773 ± 38% +57.2% 2787 ± 32% interrupts.CPU102.CAL:Function_call_interrupts
1683 ± 44% +146.8% 4155 ± 74% interrupts.CPU113.CAL:Function_call_interrupts
1700 ± 46% +140.0% 4081 ± 75% interrupts.CPU115.CAL:Function_call_interrupts
73.75 ± 11% +103.1% 149.75 ± 68% interrupts.CPU115.RES:Rescheduling_interrupts
63.00 ± 8% +121.0% 139.25 ± 52% interrupts.CPU117.RES:Rescheduling_interrupts
59.75 ± 11% +137.7% 142.00 ± 70% interrupts.CPU118.RES:Rescheduling_interrupts
1658 ± 44% +115.0% 3566 ± 50% interrupts.CPU119.CAL:Function_call_interrupts
1700 ± 44% +60.4% 2727 ± 31% interrupts.CPU12.CAL:Function_call_interrupts
1979 ± 36% +202.4% 5986 ± 39% interrupts.CPU148.CAL:Function_call_interrupts
1415 ± 24% -39.1% 862.50 ± 37% interrupts.CPU154.NMI:Non-maskable_interrupts
1415 ± 24% -39.1% 862.50 ± 37% interrupts.CPU154.PMI:Performance_monitoring_interrupts
2737 ± 58% -47.0% 1449 ± 9% interrupts.CPU156.NMI:Non-maskable_interrupts
2737 ± 58% -47.0% 1449 ± 9% interrupts.CPU156.PMI:Performance_monitoring_interrupts
5.50 ± 71% +818.2% 50.50 ± 99% interrupts.CPU166.TLB:TLB_shootdowns
1705 ± 12% -29.8% 1197 ± 20% interrupts.CPU169.NMI:Non-maskable_interrupts
1705 ± 12% -29.8% 1197 ± 20% interrupts.CPU169.PMI:Performance_monitoring_interrupts
1664 ± 44% +51.8% 2527 ± 23% interrupts.CPU18.CAL:Function_call_interrupts
1620 ± 7% -16.6% 1351 ± 6% interrupts.CPU182.NMI:Non-maskable_interrupts
1620 ± 7% -16.6% 1351 ± 6% interrupts.CPU182.PMI:Performance_monitoring_interrupts
1704 ± 11% -20.5% 1354 ± 4% interrupts.CPU188.NMI:Non-maskable_interrupts
1704 ± 11% -20.5% 1354 ± 4% interrupts.CPU188.PMI:Performance_monitoring_interrupts
1662 ± 44% +107.1% 3442 ± 52% interrupts.CPU19.CAL:Function_call_interrupts
1706 ± 10% -25.6% 1269 ± 29% interrupts.CPU191.NMI:Non-maskable_interrupts
1706 ± 10% -25.6% 1269 ± 29% interrupts.CPU191.PMI:Performance_monitoring_interrupts
275.00 ± 63% -55.6% 122.00 ± 40% interrupts.CPU24.RES:Rescheduling_interrupts
205.25 ± 47% -40.1% 123.00 ± 63% interrupts.CPU26.RES:Rescheduling_interrupts
365.00 ± 49% -74.0% 95.00 ± 57% interrupts.CPU27.RES:Rescheduling_interrupts
196.75 ± 42% -54.8% 89.00 ± 47% interrupts.CPU28.RES:Rescheduling_interrupts
1535 ± 25% -40.4% 914.75 ± 37% interrupts.CPU3.NMI:Non-maskable_interrupts
1535 ± 25% -40.4% 914.75 ± 37% interrupts.CPU3.PMI:Performance_monitoring_interrupts
283.75 ± 75% -49.9% 142.25 ± 87% interrupts.CPU30.RES:Rescheduling_interrupts
444.25 ± 83% -61.1% 172.75 ±112% interrupts.CPU35.RES:Rescheduling_interrupts
243.50 ± 77% -59.1% 99.50 ± 71% interrupts.CPU39.RES:Rescheduling_interrupts
1719 ± 45% +81.2% 3115 ± 44% interrupts.CPU4.CAL:Function_call_interrupts
232.00 ± 69% -57.7% 98.25 ± 66% interrupts.CPU41.RES:Rescheduling_interrupts
313.50 ± 84% -71.5% 89.50 ± 69% interrupts.CPU46.RES:Rescheduling_interrupts
7.75 ±121% +638.7% 57.25 ±121% interrupts.CPU52.TLB:TLB_shootdowns
3.75 ± 34% +1753.3% 69.50 ± 60% interrupts.CPU59.TLB:TLB_shootdowns
1887 ± 16% -35.3% 1221 ± 27% interrupts.CPU62.NMI:Non-maskable_interrupts
1887 ± 16% -35.3% 1221 ± 27% interrupts.CPU62.PMI:Performance_monitoring_interrupts
1728 ± 13% -18.2% 1414 ± 7% interrupts.CPU65.NMI:Non-maskable_interrupts
1728 ± 13% -18.2% 1414 ± 7% interrupts.CPU65.PMI:Performance_monitoring_interrupts
2274 ± 45% -38.0% 1411 ± 6% interrupts.CPU67.NMI:Non-maskable_interrupts
2274 ± 45% -38.0% 1411 ± 6% interrupts.CPU67.PMI:Performance_monitoring_interrupts
2155 ± 7% +20.1% 2587 ± 17% interrupts.CPU73.CAL:Function_call_interrupts
120.50 ± 71% -52.1% 57.75 ± 4% interrupts.CPU81.RES:Rescheduling_interrupts
148.50 ± 89% -62.5% 55.75 ± 6% interrupts.CPU82.RES:Rescheduling_interrupts
157.75 ± 86% -58.2% 66.00 ± 17% interrupts.CPU84.RES:Rescheduling_interrupts
2192 ± 8% +26.9% 2782 ± 26% interrupts.CPU87.CAL:Function_call_interrupts
236.50 ±118% -76.8% 54.75 ± 5% interrupts.CPU88.RES:Rescheduling_interrupts
165.25 ±100% -66.4% 55.50 ± 7% interrupts.CPU91.RES:Rescheduling_interrupts
1844 ± 25% +54.6% 2852 ± 34% interrupts.CPU94.CAL:Function_call_interrupts
1798 ± 44% +131.5% 4164 ± 76% interrupts.CPU97.CAL:Function_call_interrupts
962.75 ± 22% +47.1% 1416 ± 5% interrupts.CPU97.NMI:Non-maskable_interrupts
962.75 ± 22% +47.1% 1416 ± 5% interrupts.CPU97.PMI:Performance_monitoring_interrupts
22.80 ± 60% -22.6 0.25 ±173% perf-profile.calltrace.cycles-pp.asm_exc_page_fault
22.75 ± 60% -22.5 0.25 ±173% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault
22.73 ± 60% -22.5 0.25 ±173% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
22.50 ± 60% -22.3 0.24 ±173% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
42.69 ± 12% -17.0 25.71 ± 6% perf-profile.calltrace.cycles-pp.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
40.06 ± 12% -16.5 23.53 ± 6% perf-profile.calltrace.cycles-pp.mem_cgroup_charge.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault.__do_fault
56.69 ± 9% -14.3 42.35 ± 4% perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
56.89 ± 9% -14.3 42.63 ± 4% perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
56.92 ± 9% -14.2 42.68 ± 4% perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
61.30 ± 8% -13.5 47.82 ± 3% perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
61.64 ± 8% -13.5 48.18 ± 3% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
15.26 ± 14% -5.6 9.71 ± 7% perf-profile.calltrace.cycles-pp.get_mem_cgroup_from_mm.mem_cgroup_charge.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault
2.72 ± 2% -0.6 2.09 ± 7% perf-profile.calltrace.cycles-pp.__pagevec_lru_add.lru_cache_add.shmem_getpage_gfp.shmem_fault.__do_fault
2.85 ± 2% -0.6 2.25 ± 7% perf-profile.calltrace.cycles-pp.lru_cache_add.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
1.54 ± 7% -0.5 1.07 ± 10% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.lock_page_lruvec_irqsave.__pagevec_lru_add.lru_cache_add.shmem_getpage_gfp
1.49 ± 7% -0.5 1.02 ± 11% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.lock_page_lruvec_irqsave.__pagevec_lru_add.lru_cache_add
1.54 ± 7% -0.5 1.08 ± 10% perf-profile.calltrace.cycles-pp.lock_page_lruvec_irqsave.__pagevec_lru_add.lru_cache_add.shmem_getpage_gfp.shmem_fault
1.42 ± 4% +0.3 1.72 ± 3% perf-profile.calltrace.cycles-pp.shmem_alloc_page.shmem_alloc_and_acct_page.shmem_getpage_gfp.shmem_fault.__do_fault
0.62 ± 19% +0.4 1.06 ± 2% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.shmem_alloc_page.shmem_alloc_and_acct_page
0.12 ±173% +0.4 0.57 ± 2% perf-profile.calltrace.cycles-pp.propagate_protected_usage.page_counter_try_charge.try_charge.mem_cgroup_charge.shmem_add_to_page_cache
1.78 ± 8% +0.5 2.23 ± 4% perf-profile.calltrace.cycles-pp.try_charge.mem_cgroup_charge.shmem_add_to_page_cache.shmem_getpage_gfp.shmem_fault
1.21 ± 23% +0.5 1.68 ± 3% perf-profile.calltrace.cycles-pp.page_counter_try_charge.try_charge.mem_cgroup_charge.shmem_add_to_page_cache.shmem_getpage_gfp
1.03 ± 22% +0.5 1.52 ± 3% perf-profile.calltrace.cycles-pp.alloc_pages_vma.shmem_alloc_page.shmem_alloc_and_acct_page.shmem_getpage_gfp.shmem_fault
1.94 ± 6% +0.5 2.43 ± 4% perf-profile.calltrace.cycles-pp.shmem_alloc_and_acct_page.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
0.00 +0.5 0.53 ± 2% perf-profile.calltrace.cycles-pp.unlock_page.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault
0.00 +0.5 0.53 ± 3% perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma
0.15 ±173% +0.6 0.70 ± 11% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
0.15 ±173% +0.6 0.70 ± 11% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
0.15 ±173% +0.6 0.70 ± 11% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
0.15 ±173% +0.6 0.70 ± 11% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
0.15 ±173% +0.6 0.70 ± 11% perf-profile.calltrace.cycles-pp.__munmap
0.78 ± 18% +0.6 1.35 ± 3% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.shmem_alloc_page.shmem_alloc_and_acct_page.shmem_getpage_gfp
0.17 ±173% +0.7 0.88 ± 3% perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.shmem_alloc_page
2.69 ± 3% +0.8 3.52 ± 3% perf-profile.calltrace.cycles-pp.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
4.14 ± 3% +1.3 5.39 ± 5% perf-profile.calltrace.cycles-pp.clear_page_erms.shmem_getpage_gfp.shmem_fault.__do_fault.do_fault
8.12 ± 26% +7.6 15.71 ± 4% perf-profile.calltrace.cycles-pp.do_rw_once
42.71 ± 12% -16.9 25.78 ± 6% perf-profile.children.cycles-pp.shmem_add_to_page_cache
40.19 ± 12% -16.6 23.62 ± 6% perf-profile.children.cycles-pp.mem_cgroup_charge
56.70 ± 9% -14.3 42.40 ± 4% perf-profile.children.cycles-pp.shmem_getpage_gfp
56.90 ± 9% -14.2 42.67 ± 4% perf-profile.children.cycles-pp.shmem_fault
56.93 ± 9% -14.2 42.71 ± 4% perf-profile.children.cycles-pp.__do_fault
61.33 ± 8% -13.4 47.91 ± 3% perf-profile.children.cycles-pp.do_fault
62.71 ± 8% -13.4 49.34 ± 3% perf-profile.children.cycles-pp.handle_mm_fault
61.69 ± 8% -13.4 48.32 ± 3% perf-profile.children.cycles-pp.__handle_mm_fault
63.34 ± 8% -13.2 50.13 ± 3% perf-profile.children.cycles-pp.do_user_addr_fault
63.41 ± 8% -13.2 50.21 ± 3% perf-profile.children.cycles-pp.exc_page_fault
64.44 ± 7% -12.3 52.11 ± 3% perf-profile.children.cycles-pp.asm_exc_page_fault
15.47 ± 13% -5.7 9.78 ± 7% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
2.67 ± 5% -1.6 1.02 ± 5% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
1.73 ± 6% -1.0 0.69 ± 4% perf-profile.children.cycles-pp.__mod_memcg_state
2.77 ± 5% -0.7 2.11 ± 7% perf-profile.children.cycles-pp.__pagevec_lru_add
2.90 ± 5% -0.6 2.27 ± 7% perf-profile.children.cycles-pp.lru_cache_add
1.58 ± 8% -0.5 1.10 ± 10% perf-profile.children.cycles-pp.lock_page_lruvec_irqsave
1.69 ± 8% -0.5 1.23 ± 10% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
1.58 ± 8% -0.4 1.14 ± 10% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.62 ± 24% -0.3 0.31 ± 4% perf-profile.children.cycles-pp.page_remove_rmap
1.13 ± 6% -0.2 0.94 ± 6% perf-profile.children.cycles-pp.__count_memcg_events
1.12 ± 2% -0.1 0.99 ± 3% perf-profile.children.cycles-pp.page_add_file_rmap
0.55 ± 4% -0.1 0.46 ± 7% perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
1.36 ± 3% -0.1 1.27 ± 3% perf-profile.children.cycles-pp.finish_fault
0.11 ± 11% -0.0 0.09 ± 5% perf-profile.children.cycles-pp.obj_cgroup_charge
0.06 ± 9% +0.0 0.07 perf-profile.children.cycles-pp.__might_sleep
0.09 ± 5% +0.0 0.10 ± 4% perf-profile.children.cycles-pp.___might_sleep
0.10 ± 5% +0.0 0.11 ± 4% perf-profile.children.cycles-pp.cap_vm_enough_memory
0.10 ± 10% +0.0 0.13 ± 3% perf-profile.children.cycles-pp.xas_start
0.08 ± 5% +0.0 0.10 ± 8% perf-profile.children.cycles-pp.vmacache_find
0.09 ± 9% +0.0 0.11 ± 4% perf-profile.children.cycles-pp.find_vma
0.09 ± 8% +0.0 0.12 ± 10% perf-profile.children.cycles-pp.xas_find_conflict
0.06 ± 9% +0.0 0.08 ± 5% perf-profile.children.cycles-pp.cgroup_throttle_swaprate
0.10 ± 9% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.11 ± 7% +0.0 0.14 ± 5% perf-profile.children.cycles-pp.security_vm_enough_memory_mm
0.12 +0.0 0.16 ± 9% perf-profile.children.cycles-pp.shmem_pseudo_vma_init
0.16 ± 6% +0.0 0.21 ± 5% perf-profile.children.cycles-pp.find_get_entry
0.17 ± 4% +0.0 0.22 ± 5% perf-profile.children.cycles-pp.find_lock_entry
0.00 +0.1 0.05 ± 9% perf-profile.children.cycles-pp.__slab_alloc
0.00 +0.1 0.05 ± 9% perf-profile.children.cycles-pp.___slab_alloc
0.24 ± 5% +0.1 0.30 ± 5% perf-profile.children.cycles-pp.___perf_sw_event
0.01 ±173% +0.1 0.07 ± 34% perf-profile.children.cycles-pp.update_curr
0.29 ± 4% +0.1 0.36 ± 2% perf-profile.children.cycles-pp.xas_find
0.34 ± 9% +0.1 0.42 ± 8% perf-profile.children.cycles-pp.__mod_node_page_state
0.33 ± 4% +0.1 0.42 ± 3% perf-profile.children.cycles-pp.__perf_sw_event
0.29 ± 6% +0.1 0.38 ± 5% perf-profile.children.cycles-pp._raw_spin_lock
0.36 ± 13% +0.1 0.45 ± 3% perf-profile.children.cycles-pp.xas_load
0.41 ± 12% +0.1 0.52 ± 7% perf-profile.children.cycles-pp.__mod_lruvec_state
0.31 ± 24% +0.1 0.43 ± 8% perf-profile.children.cycles-pp.xas_store
0.42 +0.1 0.55 ± 3% perf-profile.children.cycles-pp.rmqueue_bulk
0.21 ± 3% +0.1 0.35 ± 25% perf-profile.children.cycles-pp.task_tick_fair
0.63 +0.1 0.77 ± 4% perf-profile.children.cycles-pp.sync_regs
0.33 ± 5% +0.2 0.49 ± 8% perf-profile.children.cycles-pp.lock_page_memcg
0.75 +0.2 0.92 ± 2% perf-profile.children.cycles-pp.rmqueue
0.51 ± 6% +0.2 0.69 ± 2% perf-profile.children.cycles-pp.unlock_page
0.94 +0.2 1.12 ± 2% perf-profile.children.cycles-pp.get_page_from_freelist
1.31 +0.2 1.55 ± 2% perf-profile.children.cycles-pp.alloc_pages_vma
1.24 +0.2 1.49 ± 3% perf-profile.children.cycles-pp.__alloc_pages_nodemask
1.45 +0.3 1.74 ± 3% perf-profile.children.cycles-pp.shmem_alloc_page
1.40 ± 6% +0.4 1.76 ± 4% perf-profile.children.cycles-pp.page_counter_try_charge
0.27 ± 88% +0.4 0.70 ± 11% perf-profile.children.cycles-pp.__munmap
1.82 ± 6% +0.4 2.26 ± 4% perf-profile.children.cycles-pp.try_charge
1.98 ± 4% +0.5 2.46 ± 3% perf-profile.children.cycles-pp.shmem_alloc_and_acct_page
3.39 ± 6% +0.5 3.90 ± 5% perf-profile.children.cycles-pp.native_irq_return_iret
2.81 ± 2% +0.8 3.65 ± 2% perf-profile.children.cycles-pp.filemap_map_pages
4.27 ± 3% +1.3 5.52 ± 5% perf-profile.children.cycles-pp.clear_page_erms
7.79 ± 27% +7.5 15.33 ± 4% perf-profile.children.cycles-pp.do_rw_once
22.16 ± 13% -11.2 10.93 ± 8% perf-profile.self.cycles-pp.mem_cgroup_charge
15.33 ± 13% -5.7 9.63 ± 8% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
1.72 ± 6% -1.0 0.67 ± 4% perf-profile.self.cycles-pp.__mod_memcg_state
0.94 ± 4% -0.6 0.34 ± 11% perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
1.58 ± 8% -0.4 1.14 ± 10% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
1.13 ± 5% -0.2 0.93 ± 7% perf-profile.self.cycles-pp.__count_memcg_events
0.08 ± 5% -0.1 0.03 ±100% perf-profile.self.cycles-pp.obj_cgroup_charge
0.09 ± 5% +0.0 0.10 ± 4% perf-profile.self.cycles-pp.___might_sleep
0.07 +0.0 0.09 ± 7% perf-profile.self.cycles-pp.xas_create
0.11 ± 4% +0.0 0.13 ± 3% perf-profile.self.cycles-pp.asm_exc_page_fault
0.10 +0.0 0.12 ± 6% perf-profile.self.cycles-pp.do_fault
0.09 ± 10% +0.0 0.11 ± 4% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.08 ± 5% +0.0 0.10 ± 7% perf-profile.self.cycles-pp.xas_find_conflict
0.15 ± 5% +0.0 0.17 ± 4% perf-profile.self.cycles-pp.do_user_addr_fault
0.11 ± 4% +0.0 0.13 ± 3% perf-profile.self.cycles-pp.xas_find
0.24 ± 2% +0.0 0.27 perf-profile.self.cycles-pp.rmqueue
0.11 ± 4% +0.0 0.14 ± 11% perf-profile.self.cycles-pp.shmem_pseudo_vma_init
0.08 ± 10% +0.0 0.11 ± 6% perf-profile.self.cycles-pp.page_add_file_rmap
0.03 ±100% +0.0 0.06 perf-profile.self.cycles-pp.cap_vm_enough_memory
0.03 ±100% +0.0 0.07 ± 7% perf-profile.self.cycles-pp.cgroup_throttle_swaprate
0.19 ± 6% +0.0 0.23 ± 4% perf-profile.self.cycles-pp.___perf_sw_event
0.23 ± 4% +0.0 0.27 ± 4% perf-profile.self.cycles-pp._raw_spin_lock
0.15 ± 5% +0.1 0.20 ± 7% perf-profile.self.cycles-pp.__alloc_pages_nodemask
0.01 ±173% +0.1 0.07 ± 24% perf-profile.self.cycles-pp.task_tick_fair
0.22 ± 3% +0.1 0.28 perf-profile.self.cycles-pp.handle_mm_fault
0.28 +0.1 0.35 ± 5% perf-profile.self.cycles-pp.__handle_mm_fault
0.22 ± 5% +0.1 0.28 ± 3% perf-profile.self.cycles-pp.alloc_set_pte
0.29 ± 3% +0.1 0.36 ± 2% perf-profile.self.cycles-pp.rmqueue_bulk
0.20 ± 16% +0.1 0.27 ± 10% perf-profile.self.cycles-pp.shmem_fault
0.16 ± 13% +0.1 0.23 ± 6% perf-profile.self.cycles-pp.xas_store
0.27 ± 13% +0.1 0.34 ± 5% perf-profile.self.cycles-pp.xas_load
0.33 ± 9% +0.1 0.41 ± 8% perf-profile.self.cycles-pp.__mod_node_page_state
0.38 ± 2% +0.1 0.46 ± 8% perf-profile.self.cycles-pp.__pagevec_lru_add
0.22 ± 7% +0.1 0.31 ± 2% perf-profile.self.cycles-pp.shmem_add_to_page_cache
0.45 ± 4% +0.1 0.55 ± 6% perf-profile.self.cycles-pp.try_charge
0.62 +0.1 0.76 ± 4% perf-profile.self.cycles-pp.sync_regs
0.33 ± 4% +0.2 0.48 ± 8% perf-profile.self.cycles-pp.lock_page_memcg
0.49 ± 7% +0.2 0.65 ± 2% perf-profile.self.cycles-pp.unlock_page
0.20 ± 16% +0.2 0.39 ± 12% perf-profile.self.cycles-pp.shmem_alloc_and_acct_page
0.83 ± 6% +0.3 1.16 ± 5% perf-profile.self.cycles-pp.page_counter_try_charge
3.39 ± 6% +0.5 3.90 ± 5% perf-profile.self.cycles-pp.native_irq_return_iret
1.89 +0.6 2.45 ± 2% perf-profile.self.cycles-pp.filemap_map_pages
4.24 ± 3% +1.2 5.45 ± 5% perf-profile.self.cycles-pp.clear_page_erms
4.46 ± 2% +1.4 5.86 ± 5% perf-profile.self.cycles-pp.shmem_getpage_gfp
4.07 ± 26% +3.7 7.75 ± 3% perf-profile.self.cycles-pp.do_access
6.29 ± 28% +6.2 12.50 ± 3% perf-profile.self.cycles-pp.do_rw_once
vm-scalability.throughput
1.2e+08 +-----------------------------------------------------------------+
| O |
1e+08 |-+ O O O O O O O O O O O O |
| |
| O O O O O O O O |
8e+07 |-+ |
|. .+.+..+. .+.+..+.+. .+.+..+. .+.+.. .+.. |
6e+07 |-+..+ +..+ +..+ +.+. +. .+..+.+ +.|
| + |
4e+07 |-+ |
| |
| |
2e+07 |-+ |
| |
0 +-----------------------------------------------------------------+
vm-scalability.free_time
0.025 +-------------------------------------------------------------------+
|.+.. .+.+.+.. .+.. .+.. |
| +.+..+.+. .+..+ +.+.+..+ +.+..+.+.+..+. .+. .+.|
0.02 |-+ + +. +. |
| |
| |
0.015 |-+ |
| O O O O O O O O O O O O O O O O O O O O O |
0.01 |-+ |
| |
| |
0.005 |-+ |
| |
| |
0 +-------------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Oliver Sang
View attachment "config-5.11.0-rc1-00010-g4d8191276e02" of type "text/plain" (172412 bytes)
View attachment "job-script" of type "text/plain" (7928 bytes)
View attachment "job.yaml" of type "text/plain" (5306 bytes)
View attachment "reproduce" of type "text/plain" (946828 bytes)