Message-ID: <aT_8woTbtklin3Bh@milan>
Date: Mon, 15 Dec 2025 13:19:14 +0100
From: Uladzislau Rezki <urezki@...il.com>
To: kernel test robot <oliver.sang@...el.com>
Cc: Uladzislau Rezki <urezki@...il.com>, oe-lkp@...ts.linux.dev,
lkp@...el.com, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...e.com>, Baoquan He <bhe@...hat.com>,
Alexander Potapenko <glider@...gle.com>,
Andrey Ryabinin <ryabinin.a.a@...il.com>,
Marco Elver <elver@...gle.com>, Michal Hocko <mhocko@...nel.org>,
linux-mm@...ck.org
Subject: Re: [linus:master] [mm/vmalloc] 9c47753167:
stress-ng.bigheap.realloc_calls_per_sec 21.3% regression
On Fri, Dec 12, 2025 at 11:27:27AM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed a 21.3% regression of stress-ng.bigheap.realloc_calls_per_sec on:
>
>
> commit: 9c47753167a6a585d0305663c6912f042e131c2d ("mm/vmalloc: defer freeing partly initialized vm_struct")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> [still regression on linus/master c9b47175e9131118e6f221cc8fb81397d62e7c91]
> [still regression on linux-next/master 008d3547aae5bc86fac3eda317489169c3fda112]
>
> testcase: stress-ng
> config: x86_64-rhel-9.4
> compiler: gcc-14
> test machine: 256 threads 2 sockets Intel(R) Xeon(R) 6767P CPU @ 2.4GHz (Granite Rapids) with 256G memory
> parameters:
>
> nr_threads: 100%
> testtime: 60s
> test: bigheap
> cpufreq_governor: performance
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add the following tags
> | Reported-by: kernel test robot <oliver.sang@...el.com>
> | Closes: https://lore.kernel.org/oe-lkp/202512121138.986f6a6b-lkp@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20251212/202512121138.986f6a6b-lkp@intel.com
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-gnr-2sp3/bigheap/stress-ng/60s
>
> commit:
> 86e968d8ca ("mm/vmalloc: support non-blocking GFP flags in alloc_vmap_area()")
> 9c47753167 ("mm/vmalloc: defer freeing partly initialized vm_struct")
>
> 86e968d8ca6dc823 9c47753167a6a585d0305663c69
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 209109 ± 5% -14.1% 179718 ± 6% numa-meminfo.node0.PageTables
> 1278595 ± 7% -10.4% 1145748 ± 2% sched_debug.cpu.max_idle_balance_cost.max
> 33.90 -3.6% 32.67 turbostat.RAMWatt
> 3.885e+08 -10.9% 3.463e+08 numa-numastat.node0.local_node
> 3.886e+08 -10.8% 3.466e+08 numa-numastat.node0.numa_hit
> 3.881e+08 -10.9% 3.46e+08 numa-numastat.node1.local_node
> 3.883e+08 -10.9% 3.461e+08 numa-numastat.node1.numa_hit
> 3.886e+08 -10.8% 3.466e+08 numa-vmstat.node0.numa_hit
> 3.885e+08 -10.9% 3.463e+08 numa-vmstat.node0.numa_local
> 3.883e+08 -10.9% 3.461e+08 numa-vmstat.node1.numa_hit
> 3.881e+08 -10.9% 3.46e+08 numa-vmstat.node1.numa_local
> 48320196 -10.9% 43072080 stress-ng.bigheap.ops
> 785159 -9.8% 708390 stress-ng.bigheap.ops_per_sec
> 879805 -21.3% 692805 stress-ng.bigheap.realloc_calls_per_sec
> 72414 -3.3% 70043 stress-ng.time.involuntary_context_switches
> 7.735e+08 -10.9% 6.895e+08 stress-ng.time.minor_page_faults
> 15385 -1.0% 15224 stress-ng.time.system_time
> 236.00 -10.5% 211.19 ± 2% stress-ng.time.user_time
> 0.32 ± 4% +95.1% 0.63 ± 12% perf-sched.sch_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
> 16.96 ± 41% +5031.1% 870.26 ± 40% perf-sched.sch_delay.max.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
> 0.32 ± 4% +95.1% 0.63 ± 12% perf-sched.total_sch_delay.average.ms
> 16.96 ± 41% +5031.1% 870.26 ± 40% perf-sched.total_sch_delay.max.ms
> 4750 ± 4% -12.2% 4169 ± 4% perf-sched.total_wait_and_delay.max.ms
> 4750 ± 4% -12.2% 4169 ± 4% perf-sched.total_wait_time.max.ms
> 4750 ± 4% -12.2% 4169 ± 4% perf-sched.wait_and_delay.max.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
> 4750 ± 4% -12.2% 4169 ± 4% perf-sched.wait_time.max.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
> 29568942 -2.9% 28712561 proc-vmstat.nr_active_anon
> 28797015 -2.8% 27991137 proc-vmstat.nr_anon_pages
> 99294 -3.7% 95669 proc-vmstat.nr_page_table_pages
> 29568950 -2.9% 28712562 proc-vmstat.nr_zone_active_anon
> 7.77e+08 -10.9% 6.927e+08 proc-vmstat.numa_hit
> 7.766e+08 -10.9% 6.923e+08 proc-vmstat.numa_local
> 7.785e+08 -10.8% 6.941e+08 proc-vmstat.pgalloc_normal
> 7.739e+08 -10.8% 6.899e+08 proc-vmstat.pgfault
> 7.756e+08 -10.6% 6.931e+08 proc-vmstat.pgfree
> 7.68 -3.8% 7.39 perf-stat.i.MPKI
> 2.811e+10 -4.9% 2.672e+10 perf-stat.i.branch-instructions
> 0.06 -0.0 0.05 perf-stat.i.branch-miss-rate%
> 15424402 -14.3% 13220241 perf-stat.i.branch-misses
> 80.75 -2.3 78.42 perf-stat.i.cache-miss-rate%
> 1.037e+09 -11.0% 9.233e+08 perf-stat.i.cache-misses
> 1.217e+09 -10.6% 1.088e+09 perf-stat.i.cache-references
> 2817 ± 2% -2.8% 2739 perf-stat.i.context-switches
> 7.16 +5.1% 7.53 perf-stat.i.cpi
> 1846 ± 5% +30.6% 2410 ± 5% perf-stat.i.cycles-between-cache-misses
> 1.298e+11 -5.9% 1.222e+11 perf-stat.i.instructions
> 0.14 -5.2% 0.13 perf-stat.i.ipc
> 103.98 -9.7% 93.94 perf-stat.i.metric.K/sec
> 13534286 -11.0% 12040965 perf-stat.i.minor-faults
> 13534286 -11.0% 12040965 perf-stat.i.page-faults
> 7.64 -5.3% 7.23 perf-stat.overall.MPKI
> 0.05 -0.0 0.05 perf-stat.overall.branch-miss-rate%
> 7.20 +5.3% 7.58 perf-stat.overall.cpi
> 942.28 +11.2% 1047 perf-stat.overall.cycles-between-cache-misses
> 0.14 -5.0% 0.13 perf-stat.overall.ipc
> 2.678e+10 -4.1% 2.569e+10 perf-stat.ps.branch-instructions
> 14559650 -13.3% 12627015 perf-stat.ps.branch-misses
> 9.434e+08 -10.0% 8.491e+08 perf-stat.ps.cache-misses
> 1.112e+09 -9.5% 1.006e+09 perf-stat.ps.cache-references
> 1.235e+11 -4.9% 1.174e+11 perf-stat.ps.instructions
> 12270397 -10.0% 11048367 perf-stat.ps.minor-faults
> 12270398 -10.0% 11048367 perf-stat.ps.page-faults
> 7.755e+12 -5.9% 7.3e+12 perf-stat.total.instructions
> 42.85 -5.2 37.62 perf-profile.calltrace.cycles-pp.__munmap
> 42.85 -5.2 37.62 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
> 42.85 -5.2 37.62 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
> 42.85 -5.2 37.62 perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
> 42.85 -5.2 37.62 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
> 42.85 -5.2 37.62 perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
> 42.85 -5.2 37.62 perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 42.85 -5.2 37.62 perf-profile.calltrace.cycles-pp.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
> 42.85 -5.2 37.62 perf-profile.calltrace.cycles-pp.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
> 41.78 ± 2% -5.1 36.70 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap
> 41.78 ± 2% -5.1 36.70 perf-profile.calltrace.cycles-pp.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap
> 41.78 ± 2% -5.1 36.70 perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas
> 41.78 ± 2% -5.1 36.70 perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.vms_clear_ptes
> 41.51 ± 2% -5.1 36.45 perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
> 41.51 ± 2% -5.1 36.45 perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
> 41.51 ± 2% -5.1 36.45 perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
> 41.65 -5.1 36.60 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache
> 41.63 -5.1 36.58 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs
> 41.65 -5.1 36.60 perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
> 41.46 ± 2% -5.0 36.41 perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
> 40.84 ± 2% -4.9 35.90 perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
> 3.89 ± 4% -2.4 1.53 ± 8% perf-profile.calltrace.cycles-pp.si_meminfo.do_sysinfo.__do_sys_sysinfo.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 3.84 ± 4% -2.4 1.49 ± 8% perf-profile.calltrace.cycles-pp.nr_blockdev_pages.si_meminfo.do_sysinfo.__do_sys_sysinfo.do_syscall_64
> 3.82 ± 4% -2.3 1.47 ± 9% perf-profile.calltrace.cycles-pp._raw_spin_lock.nr_blockdev_pages.si_meminfo.do_sysinfo.__do_sys_sysinfo
> 3.74 ± 4% -2.3 1.43 ± 9% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.nr_blockdev_pages.si_meminfo.do_sysinfo
> 3.10 ± 2% -0.6 2.45 ± 2% perf-profile.calltrace.cycles-pp.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
> 1.90 -0.4 1.52 perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault
> 1.84 -0.4 1.48 perf-profile.calltrace.cycles-pp.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page.__handle_mm_fault
> 1.80 -0.4 1.44 perf-profile.calltrace.cycles-pp.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page
> 1.70 -0.4 1.36 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio
> 1.43 ± 6% -0.3 1.12 ± 2% perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
> 1.26 ± 4% -0.3 0.98 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.do_anonymous_page.__handle_mm_fault.handle_mm_fault
> 1.21 -0.3 0.95 perf-profile.calltrace.cycles-pp.prep_new_page.get_page_from_freelist.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof
> 1.16 ± 8% -0.3 0.90 ± 5% perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault
> 1.17 -0.3 0.92 perf-profile.calltrace.cycles-pp.clear_page_erms.prep_new_page.get_page_from_freelist.__alloc_frozen_pages_noprof.alloc_pages_mpol
> 44.15 ± 2% +7.5 51.61 ± 2% perf-profile.calltrace.cycles-pp.do_sysinfo.__do_sys_sysinfo.do_syscall_64.entry_SYSCALL_64_after_hwframe.sysinfo
> 44.32 ± 2% +7.5 51.79 ± 2% perf-profile.calltrace.cycles-pp.sysinfo
> 44.30 ± 2% +7.5 51.77 ± 2% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sysinfo
> 44.30 ± 2% +7.5 51.77 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sysinfo
> 44.28 ± 2% +7.5 51.75 ± 2% perf-profile.calltrace.cycles-pp.__do_sys_sysinfo.do_syscall_64.entry_SYSCALL_64_after_hwframe.sysinfo
> 40.25 ± 2% +9.8 50.06 ± 2% perf-profile.calltrace.cycles-pp.si_swapinfo.do_sysinfo.__do_sys_sysinfo.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 40.24 ± 2% +9.8 50.06 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock.si_swapinfo.do_sysinfo.__do_sys_sysinfo.do_syscall_64
> 40.08 ± 2% +9.8 49.92 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.si_swapinfo.do_sysinfo.__do_sys_sysinfo
> 44.76 ± 2% -6.0 38.80 ± 4% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> 44.44 ± 2% -5.9 38.56 ± 4% perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
> 42.85 -5.2 37.62 perf-profile.children.cycles-pp.__munmap
> 42.85 -5.2 37.62 perf-profile.children.cycles-pp.__vm_munmap
> 42.85 -5.2 37.62 perf-profile.children.cycles-pp.__x64_sys_munmap
> 42.88 -5.2 37.65 perf-profile.children.cycles-pp.do_vmi_align_munmap
> 42.88 -5.2 37.65 perf-profile.children.cycles-pp.vms_clear_ptes
> 42.88 -5.2 37.65 perf-profile.children.cycles-pp.vms_complete_munmap_vmas
> 42.86 -5.2 37.64 perf-profile.children.cycles-pp.do_vmi_munmap
> 42.62 -5.2 37.40 perf-profile.children.cycles-pp.folios_put_refs
> 42.60 -5.2 37.40 perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
> 42.60 -5.2 37.40 perf-profile.children.cycles-pp.free_pages_and_swap_cache
> 41.93 -5.1 36.84 perf-profile.children.cycles-pp.__page_cache_release
> 41.80 ± 2% -5.1 36.72 perf-profile.children.cycles-pp.unmap_page_range
> 41.80 ± 2% -5.1 36.72 perf-profile.children.cycles-pp.unmap_vmas
> 41.80 ± 2% -5.1 36.72 perf-profile.children.cycles-pp.zap_pmd_range
> 41.80 ± 2% -5.1 36.72 perf-profile.children.cycles-pp.zap_pte_range
> 41.51 ± 2% -5.1 36.45 perf-profile.children.cycles-pp.tlb_flush_mmu
> 3.89 ± 4% -2.4 1.53 ± 8% perf-profile.children.cycles-pp.si_meminfo
> 3.84 ± 4% -2.4 1.49 ± 8% perf-profile.children.cycles-pp.nr_blockdev_pages
> 3.11 ± 2% -0.6 2.46 ± 2% perf-profile.children.cycles-pp.alloc_anon_folio
> 1.90 -0.4 1.52 perf-profile.children.cycles-pp.vma_alloc_folio_noprof
> 1.89 -0.4 1.52 perf-profile.children.cycles-pp.alloc_pages_mpol
> 1.84 -0.4 1.48 perf-profile.children.cycles-pp.__alloc_frozen_pages_noprof
> 1.73 -0.3 1.39 perf-profile.children.cycles-pp.get_page_from_freelist
> 0.56 ± 72% -0.3 0.22 ±108% perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
> 1.45 ± 6% -0.3 1.14 ± 3% perf-profile.children.cycles-pp.__pte_offset_map_lock
> 1.22 -0.3 0.96 perf-profile.children.cycles-pp.prep_new_page
> 1.16 ± 7% -0.3 0.90 ± 5% perf-profile.children.cycles-pp.__mem_cgroup_charge
> 1.19 -0.3 0.93 perf-profile.children.cycles-pp.clear_page_erms
> 0.26 ± 8% -0.1 0.16 ± 3% perf-profile.children.cycles-pp.handle_internal_command
> 0.26 ± 8% -0.1 0.16 ± 3% perf-profile.children.cycles-pp.main
> 0.26 ± 8% -0.1 0.16 ± 3% perf-profile.children.cycles-pp.run_builtin
> 0.44 ± 10% -0.1 0.35 ± 6% perf-profile.children.cycles-pp.free_unref_folios
> 0.25 ± 9% -0.1 0.16 ± 3% perf-profile.children.cycles-pp.record__mmap_read_evlist
> 0.40 ± 11% -0.1 0.31 ± 6% perf-profile.children.cycles-pp.free_frozen_page_commit
> 0.24 ± 8% -0.1 0.16 ± 4% perf-profile.children.cycles-pp.perf_mmap__push
> 0.38 ± 13% -0.1 0.30 ± 7% perf-profile.children.cycles-pp.free_pcppages_bulk
> 0.55 -0.1 0.48 perf-profile.children.cycles-pp.sync_regs
> 0.48 ± 4% -0.1 0.42 ± 2% perf-profile.children.cycles-pp.native_irq_return_iret
> 0.37 ± 4% -0.1 0.31 ± 3% perf-profile.children.cycles-pp.rmqueue
> 0.35 ± 4% -0.1 0.30 ± 3% perf-profile.children.cycles-pp.rmqueue_pcplist
> 0.19 ± 6% -0.0 0.14 ± 3% perf-profile.children.cycles-pp.record__pushfn
> 0.18 ± 7% -0.0 0.13 ± 2% perf-profile.children.cycles-pp.ksys_write
> 0.17 ± 5% -0.0 0.13 ± 3% perf-profile.children.cycles-pp.vfs_write
> 0.28 ± 5% -0.0 0.24 ± 3% perf-profile.children.cycles-pp.__rmqueue_pcplist
> 0.31 -0.0 0.27 perf-profile.children.cycles-pp.lru_add
> 0.16 ± 5% -0.0 0.12 ± 3% perf-profile.children.cycles-pp.shmem_file_write_iter
> 0.24 ± 6% -0.0 0.20 ± 5% perf-profile.children.cycles-pp.rmqueue_bulk
> 0.16 ± 4% -0.0 0.12 ± 3% perf-profile.children.cycles-pp.generic_perform_write
> 0.24 ± 2% -0.0 0.20 perf-profile.children.cycles-pp.lru_gen_add_folio
> 0.21 -0.0 0.18 perf-profile.children.cycles-pp.lru_gen_del_folio
> 0.25 ± 2% -0.0 0.22 perf-profile.children.cycles-pp.zap_present_ptes
> 0.14 ± 2% -0.0 0.12 ± 3% perf-profile.children.cycles-pp.lock_vma_under_rcu
> 0.14 ± 3% -0.0 0.12 ± 4% perf-profile.children.cycles-pp.__mod_node_page_state
> 0.13 -0.0 0.12 ± 4% perf-profile.children.cycles-pp.__perf_sw_event
> 0.06 ± 7% -0.0 0.05 perf-profile.children.cycles-pp.___pte_offset_map
> 0.09 ± 5% -0.0 0.08 perf-profile.children.cycles-pp.__mem_cgroup_uncharge_folios
> 0.08 ± 6% -0.0 0.06 ± 6% perf-profile.children.cycles-pp.vma_merge_extend
> 0.11 ± 3% -0.0 0.10 perf-profile.children.cycles-pp.__free_one_page
> 0.07 -0.0 0.06 perf-profile.children.cycles-pp.error_entry
> 0.06 -0.0 0.05 perf-profile.children.cycles-pp.__mod_zone_page_state
> 0.11 -0.0 0.10 perf-profile.children.cycles-pp.___perf_sw_event
> 0.10 ± 4% +0.0 0.11 ± 4% perf-profile.children.cycles-pp.sched_tick
> 0.21 ± 3% +0.0 0.24 ± 5% perf-profile.children.cycles-pp.update_process_times
> 0.22 ± 3% +0.0 0.26 ± 7% perf-profile.children.cycles-pp.tick_nohz_handler
> 0.30 ± 4% +0.0 0.34 ± 6% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
> 0.29 ± 4% +0.0 0.33 ± 6% perf-profile.children.cycles-pp.hrtimer_interrupt
> 0.39 ± 2% +0.0 0.43 ± 2% perf-profile.children.cycles-pp.mremap
> 0.31 ± 4% +0.0 0.36 ± 5% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> 0.34 ± 3% +0.0 0.39 ± 5% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> 0.28 ± 3% +0.1 0.34 ± 2% perf-profile.children.cycles-pp.__do_sys_mremap
> 0.28 ± 2% +0.1 0.34 ± 3% perf-profile.children.cycles-pp.do_mremap
> 0.11 ± 4% +0.1 0.17 ± 2% perf-profile.children.cycles-pp.expand_vma
> 0.00 +0.1 0.08 perf-profile.children.cycles-pp.__vm_enough_memory
> 0.00 +0.1 0.09 ± 5% perf-profile.children.cycles-pp.vrm_calc_charge
> 0.04 ±141% +0.1 0.13 ± 16% perf-profile.children.cycles-pp.add_callchain_ip
> 0.04 ±142% +0.1 0.14 ± 17% perf-profile.children.cycles-pp.thread__resolve_callchain_sample
> 0.04 ±142% +0.1 0.17 ± 15% perf-profile.children.cycles-pp.__thread__resolve_callchain
> 0.04 ±142% +0.1 0.18 ± 15% perf-profile.children.cycles-pp.sample__for_each_callchain_node
> 0.05 ±141% +0.1 0.18 ± 14% perf-profile.children.cycles-pp.build_id__mark_dso_hit
> 0.05 ±141% +0.1 0.19 ± 14% perf-profile.children.cycles-pp.perf_session__deliver_event
> 0.05 ±141% +0.1 0.20 ± 14% perf-profile.children.cycles-pp.__ordered_events__flush
> 0.05 ±141% +0.1 0.20 ± 33% perf-profile.children.cycles-pp.perf_session__process_events
> 0.05 ±141% +0.1 0.20 ± 33% perf-profile.children.cycles-pp.record__finish_output
> 88.59 +1.5 90.13 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> 45.34 ± 2% +7.2 52.54 ± 2% perf-profile.children.cycles-pp._raw_spin_lock
> 44.15 ± 2% +7.5 51.61 ± 2% perf-profile.children.cycles-pp.do_sysinfo
> 44.33 ± 2% +7.5 51.80 ± 2% perf-profile.children.cycles-pp.sysinfo
> 44.28 ± 2% +7.5 51.75 ± 2% perf-profile.children.cycles-pp.__do_sys_sysinfo
> 40.25 ± 2% +9.8 50.07 ± 2% perf-profile.children.cycles-pp.si_swapinfo
> 0.55 ± 74% -0.3 0.22 ±107% perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
> 1.50 ± 4% -0.3 1.17 perf-profile.self.cycles-pp._raw_spin_lock
> 1.18 -0.3 0.92 perf-profile.self.cycles-pp.clear_page_erms
> 2.01 -0.2 1.86 ± 3% perf-profile.self.cycles-pp.stress_bigheap_child
> 0.55 -0.1 0.48 perf-profile.self.cycles-pp.sync_regs
> 0.48 ± 4% -0.1 0.42 ± 2% perf-profile.self.cycles-pp.native_irq_return_iret
> 0.14 ± 3% -0.0 0.12 ± 4% perf-profile.self.cycles-pp.get_page_from_freelist
> 0.14 ± 8% -0.0 0.12 ± 3% perf-profile.self.cycles-pp.do_anonymous_page
> 0.14 ± 2% -0.0 0.12 ± 3% perf-profile.self.cycles-pp.rmqueue_bulk
> 0.14 -0.0 0.12 perf-profile.self.cycles-pp.lru_gen_del_folio
> 0.11 ± 3% -0.0 0.09 ± 4% perf-profile.self.cycles-pp.__handle_mm_fault
> 0.15 ± 2% -0.0 0.13 perf-profile.self.cycles-pp.lru_gen_add_folio
> 0.12 ± 3% -0.0 0.10 ± 3% perf-profile.self.cycles-pp.zap_present_ptes
> 0.12 ± 4% -0.0 0.11 perf-profile.self.cycles-pp.__mod_node_page_state
> 0.07 ± 6% -0.0 0.06 perf-profile.self.cycles-pp.lock_vma_under_rcu
> 0.10 -0.0 0.09 ± 4% perf-profile.self.cycles-pp.__free_one_page
> 0.11 ± 3% -0.0 0.10 perf-profile.self.cycles-pp.folios_put_refs
> 0.07 -0.0 0.06 perf-profile.self.cycles-pp.___perf_sw_event
> 0.07 -0.0 0.06 perf-profile.self.cycles-pp.do_user_addr_fault
> 0.07 -0.0 0.06 perf-profile.self.cycles-pp.lru_add
> 0.07 -0.0 0.06 perf-profile.self.cycles-pp.mas_walk
> 0.08 -0.0 0.07 perf-profile.self.cycles-pp.__alloc_frozen_pages_noprof
> 0.06 -0.0 0.05 perf-profile.self.cycles-pp.handle_mm_fault
> 0.06 -0.0 0.05 perf-profile.self.cycles-pp.page_counter_uncharge
> 0.00 +0.1 0.08 perf-profile.self.cycles-pp.__vm_enough_memory
> 88.36 +1.5 89.85 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>
Could you please test the patch below and confirm whether it resolves the regression:
<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ecbac900c35f..118de1a8348c 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3746,6 +3746,15 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
return nr_allocated;
}
+static void
+__vm_area_cleanup(struct vm_struct *area)
+{
+ if (area->pages)
+ vfree(area->addr);
+ else
+ free_vm_area(area);
+}
+
static LLIST_HEAD(pending_vm_area_cleanup);
static void cleanup_vm_area_work(struct work_struct *work)
{
@@ -3756,12 +3765,8 @@ static void cleanup_vm_area_work(struct work_struct *work)
if (!head)
return;
- llist_for_each_entry_safe(area, tmp, head, llnode) {
- if (!area->pages)
- free_vm_area(area);
- else
- vfree(area->addr);
- }
+ llist_for_each_entry_safe(area, tmp, head, llnode)
+ __vm_area_cleanup(area);
}
/*
@@ -3769,8 +3774,11 @@ static void cleanup_vm_area_work(struct work_struct *work)
* of partially initialized vm_struct in error paths.
*/
static DECLARE_WORK(cleanup_vm_area, cleanup_vm_area_work);
-static void defer_vm_area_cleanup(struct vm_struct *area)
+static void vm_area_cleanup(struct vm_struct *area, bool can_block)
{
+ if (can_block)
+ return __vm_area_cleanup(area);
+
if (llist_add(&area->llnode, &pending_vm_area_cleanup))
schedule_work(&cleanup_vm_area);
}
@@ -3915,7 +3923,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
return area->addr;
fail:
- defer_vm_area_cleanup(area);
+ vm_area_cleanup(area, gfpflags_allow_blocking(gfp_mask));
return NULL;
}
<snip>
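
For reference, gfpflags_allow_blocking() just tests __GFP_DIRECT_RECLAIM, so
with the change above a sleepable (e.g. GFP_KERNEL) failure path frees the
partly initialized vm_struct inline, whereas non-blocking requests
(GFP_ATOMIC/GFP_NOWAIT) still take the llist + workqueue route:

<snip>
/* from include/linux/gfp.h */
static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
{
	return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
}
<snip>

i.e. the deferred machinery is kept only for contexts which really can not
block; everyone else does the cleanup directly instead of scheduling work.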
--
Uladzislau Rezki