Message-ID: <aT_8woTbtklin3Bh@milan>
Date: Mon, 15 Dec 2025 13:19:14 +0100
From: Uladzislau Rezki <urezki@...il.com>
To: kernel test robot <oliver.sang@...el.com>
Cc: Uladzislau Rezki <urezki@...il.com>, oe-lkp@...ts.linux.dev,
	lkp@...el.com, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Michal Hocko <mhocko@...e.com>, Baoquan He <bhe@...hat.com>,
	Alexander Potapenko <glider@...gle.com>,
	Andrey Ryabinin <ryabinin.a.a@...il.com>,
	Marco Elver <elver@...gle.com>, Michal Hocko <mhocko@...nel.org>,
	linux-mm@...ck.org
Subject: Re: [linus:master] [mm/vmalloc]  9c47753167:
 stress-ng.bigheap.realloc_calls_per_sec 21.3% regression

On Fri, Dec 12, 2025 at 11:27:27AM +0800, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed a 21.3% regression of stress-ng.bigheap.realloc_calls_per_sec on:
> 
> 
> commit: 9c47753167a6a585d0305663c6912f042e131c2d ("mm/vmalloc: defer freeing partly initialized vm_struct")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> [still regression on linus/master      c9b47175e9131118e6f221cc8fb81397d62e7c91]
> [still regression on linux-next/master 008d3547aae5bc86fac3eda317489169c3fda112]
> 
> testcase: stress-ng
> config: x86_64-rhel-9.4
> compiler: gcc-14
> test machine: 256 threads 2 sockets Intel(R) Xeon(R) 6767P  CPU @ 2.4GHz (Granite Rapids) with 256G memory
> parameters:
> 
> 	nr_threads: 100%
> 	testtime: 60s
> 	test: bigheap
> 	cpufreq_governor: performance
> 
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@...el.com>
> | Closes: https://lore.kernel.org/oe-lkp/202512121138.986f6a6b-lkp@intel.com
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20251212/202512121138.986f6a6b-lkp@intel.com
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
>   gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-gnr-2sp3/bigheap/stress-ng/60s
> 
> commit: 
>   86e968d8ca ("mm/vmalloc: support non-blocking GFP flags in alloc_vmap_area()")
>   9c47753167 ("mm/vmalloc: defer freeing partly initialized vm_struct")
> 
> 86e968d8ca6dc823 9c47753167a6a585d0305663c69 
> ---------------- --------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>     209109 ±  5%     -14.1%     179718 ±  6%  numa-meminfo.node0.PageTables
>    1278595 ±  7%     -10.4%    1145748 ±  2%  sched_debug.cpu.max_idle_balance_cost.max
>      33.90            -3.6%      32.67        turbostat.RAMWatt
>  3.885e+08           -10.9%  3.463e+08        numa-numastat.node0.local_node
>  3.886e+08           -10.8%  3.466e+08        numa-numastat.node0.numa_hit
>  3.881e+08           -10.9%   3.46e+08        numa-numastat.node1.local_node
>  3.883e+08           -10.9%  3.461e+08        numa-numastat.node1.numa_hit
>  3.886e+08           -10.8%  3.466e+08        numa-vmstat.node0.numa_hit
>  3.885e+08           -10.9%  3.463e+08        numa-vmstat.node0.numa_local
>  3.883e+08           -10.9%  3.461e+08        numa-vmstat.node1.numa_hit
>  3.881e+08           -10.9%   3.46e+08        numa-vmstat.node1.numa_local
>   48320196           -10.9%   43072080        stress-ng.bigheap.ops
>     785159            -9.8%     708390        stress-ng.bigheap.ops_per_sec
>     879805           -21.3%     692805        stress-ng.bigheap.realloc_calls_per_sec
>      72414            -3.3%      70043        stress-ng.time.involuntary_context_switches
>  7.735e+08           -10.9%  6.895e+08        stress-ng.time.minor_page_faults
>      15385            -1.0%      15224        stress-ng.time.system_time
>     236.00           -10.5%     211.19 ±  2%  stress-ng.time.user_time
>       0.32 ±  4%     +95.1%       0.63 ± 12%  perf-sched.sch_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
>      16.96 ± 41%   +5031.1%     870.26 ± 40%  perf-sched.sch_delay.max.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
>       0.32 ±  4%     +95.1%       0.63 ± 12%  perf-sched.total_sch_delay.average.ms
>      16.96 ± 41%   +5031.1%     870.26 ± 40%  perf-sched.total_sch_delay.max.ms
>       4750 ±  4%     -12.2%       4169 ±  4%  perf-sched.total_wait_and_delay.max.ms
>       4750 ±  4%     -12.2%       4169 ±  4%  perf-sched.total_wait_time.max.ms
>       4750 ±  4%     -12.2%       4169 ±  4%  perf-sched.wait_and_delay.max.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
>       4750 ±  4%     -12.2%       4169 ±  4%  perf-sched.wait_time.max.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
>   29568942            -2.9%   28712561        proc-vmstat.nr_active_anon
>   28797015            -2.8%   27991137        proc-vmstat.nr_anon_pages
>      99294            -3.7%      95669        proc-vmstat.nr_page_table_pages
>   29568950            -2.9%   28712562        proc-vmstat.nr_zone_active_anon
>   7.77e+08           -10.9%  6.927e+08        proc-vmstat.numa_hit
>  7.766e+08           -10.9%  6.923e+08        proc-vmstat.numa_local
>  7.785e+08           -10.8%  6.941e+08        proc-vmstat.pgalloc_normal
>  7.739e+08           -10.8%  6.899e+08        proc-vmstat.pgfault
>  7.756e+08           -10.6%  6.931e+08        proc-vmstat.pgfree
>       7.68            -3.8%       7.39        perf-stat.i.MPKI
>  2.811e+10            -4.9%  2.672e+10        perf-stat.i.branch-instructions
>       0.06            -0.0        0.05        perf-stat.i.branch-miss-rate%
>   15424402           -14.3%   13220241        perf-stat.i.branch-misses
>      80.75            -2.3       78.42        perf-stat.i.cache-miss-rate%
>  1.037e+09           -11.0%  9.233e+08        perf-stat.i.cache-misses
>  1.217e+09           -10.6%  1.088e+09        perf-stat.i.cache-references
>       2817 ±  2%      -2.8%       2739        perf-stat.i.context-switches
>       7.16            +5.1%       7.53        perf-stat.i.cpi
>       1846 ±  5%     +30.6%       2410 ±  5%  perf-stat.i.cycles-between-cache-misses
>  1.298e+11            -5.9%  1.222e+11        perf-stat.i.instructions
>       0.14            -5.2%       0.13        perf-stat.i.ipc
>     103.98            -9.7%      93.94        perf-stat.i.metric.K/sec
>   13534286           -11.0%   12040965        perf-stat.i.minor-faults
>   13534286           -11.0%   12040965        perf-stat.i.page-faults
>       7.64            -5.3%       7.23        perf-stat.overall.MPKI
>       0.05            -0.0        0.05        perf-stat.overall.branch-miss-rate%
>       7.20            +5.3%       7.58        perf-stat.overall.cpi
>     942.28           +11.2%       1047        perf-stat.overall.cycles-between-cache-misses
>       0.14            -5.0%       0.13        perf-stat.overall.ipc
>  2.678e+10            -4.1%  2.569e+10        perf-stat.ps.branch-instructions
>   14559650           -13.3%   12627015        perf-stat.ps.branch-misses
>  9.434e+08           -10.0%  8.491e+08        perf-stat.ps.cache-misses
>  1.112e+09            -9.5%  1.006e+09        perf-stat.ps.cache-references
>  1.235e+11            -4.9%  1.174e+11        perf-stat.ps.instructions
>   12270397           -10.0%   11048367        perf-stat.ps.minor-faults
>   12270398           -10.0%   11048367        perf-stat.ps.page-faults
>  7.755e+12            -5.9%    7.3e+12        perf-stat.total.instructions
>      42.85            -5.2       37.62        perf-profile.calltrace.cycles-pp.__munmap
>      42.85            -5.2       37.62        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
>      42.85            -5.2       37.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
>      42.85            -5.2       37.62        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
>      42.85            -5.2       37.62        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
>      42.85            -5.2       37.62        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
>      42.85            -5.2       37.62        perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      42.85            -5.2       37.62        perf-profile.calltrace.cycles-pp.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
>      42.85            -5.2       37.62        perf-profile.calltrace.cycles-pp.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
>      41.78 ±  2%      -5.1       36.70        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap
>      41.78 ±  2%      -5.1       36.70        perf-profile.calltrace.cycles-pp.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas.do_vmi_align_munmap.do_vmi_munmap
>      41.78 ±  2%      -5.1       36.70        perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.vms_clear_ptes.vms_complete_munmap_vmas
>      41.78 ±  2%      -5.1       36.70        perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.vms_clear_ptes
>      41.51 ±  2%      -5.1       36.45        perf-profile.calltrace.cycles-pp.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range
>      41.51 ±  2%      -5.1       36.45        perf-profile.calltrace.cycles-pp.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range
>      41.51 ±  2%      -5.1       36.45        perf-profile.calltrace.cycles-pp.tlb_flush_mmu.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
>      41.65            -5.1       36.60        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache
>      41.63            -5.1       36.58        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs
>      41.65            -5.1       36.60        perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages
>      41.46 ±  2%      -5.0       36.41        perf-profile.calltrace.cycles-pp.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu.zap_pte_range
>      40.84 ±  2%      -4.9       35.90        perf-profile.calltrace.cycles-pp.__page_cache_release.folios_put_refs.free_pages_and_swap_cache.__tlb_batch_free_encoded_pages.tlb_flush_mmu
>       3.89 ±  4%      -2.4        1.53 ±  8%  perf-profile.calltrace.cycles-pp.si_meminfo.do_sysinfo.__do_sys_sysinfo.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       3.84 ±  4%      -2.4        1.49 ±  8%  perf-profile.calltrace.cycles-pp.nr_blockdev_pages.si_meminfo.do_sysinfo.__do_sys_sysinfo.do_syscall_64
>       3.82 ±  4%      -2.3        1.47 ±  9%  perf-profile.calltrace.cycles-pp._raw_spin_lock.nr_blockdev_pages.si_meminfo.do_sysinfo.__do_sys_sysinfo
>       3.74 ±  4%      -2.3        1.43 ±  9%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.nr_blockdev_pages.si_meminfo.do_sysinfo
>       3.10 ±  2%      -0.6        2.45 ±  2%  perf-profile.calltrace.cycles-pp.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
>       1.90            -0.4        1.52        perf-profile.calltrace.cycles-pp.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault
>       1.84            -0.4        1.48        perf-profile.calltrace.cycles-pp.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page.__handle_mm_fault
>       1.80            -0.4        1.44        perf-profile.calltrace.cycles-pp.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio.do_anonymous_page
>       1.70            -0.4        1.36        perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof.alloc_anon_folio
>       1.43 ±  6%      -0.3        1.12 ±  2%  perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.do_anonymous_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
>       1.26 ±  4%      -0.3        0.98 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.do_anonymous_page.__handle_mm_fault.handle_mm_fault
>       1.21            -0.3        0.95        perf-profile.calltrace.cycles-pp.prep_new_page.get_page_from_freelist.__alloc_frozen_pages_noprof.alloc_pages_mpol.vma_alloc_folio_noprof
>       1.16 ±  8%      -0.3        0.90 ±  5%  perf-profile.calltrace.cycles-pp.__mem_cgroup_charge.alloc_anon_folio.do_anonymous_page.__handle_mm_fault.handle_mm_fault
>       1.17            -0.3        0.92        perf-profile.calltrace.cycles-pp.clear_page_erms.prep_new_page.get_page_from_freelist.__alloc_frozen_pages_noprof.alloc_pages_mpol
>      44.15 ±  2%      +7.5       51.61 ±  2%  perf-profile.calltrace.cycles-pp.do_sysinfo.__do_sys_sysinfo.do_syscall_64.entry_SYSCALL_64_after_hwframe.sysinfo
>      44.32 ±  2%      +7.5       51.79 ±  2%  perf-profile.calltrace.cycles-pp.sysinfo
>      44.30 ±  2%      +7.5       51.77 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sysinfo
>      44.30 ±  2%      +7.5       51.77 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sysinfo
>      44.28 ±  2%      +7.5       51.75 ±  2%  perf-profile.calltrace.cycles-pp.__do_sys_sysinfo.do_syscall_64.entry_SYSCALL_64_after_hwframe.sysinfo
>      40.25 ±  2%      +9.8       50.06 ±  2%  perf-profile.calltrace.cycles-pp.si_swapinfo.do_sysinfo.__do_sys_sysinfo.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      40.24 ±  2%      +9.8       50.06 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock.si_swapinfo.do_sysinfo.__do_sys_sysinfo.do_syscall_64
>      40.08 ±  2%      +9.8       49.92 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.si_swapinfo.do_sysinfo.__do_sys_sysinfo
>      44.76 ±  2%      -6.0       38.80 ±  4%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>      44.44 ±  2%      -5.9       38.56 ±  4%  perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
>      42.85            -5.2       37.62        perf-profile.children.cycles-pp.__munmap
>      42.85            -5.2       37.62        perf-profile.children.cycles-pp.__vm_munmap
>      42.85            -5.2       37.62        perf-profile.children.cycles-pp.__x64_sys_munmap
>      42.88            -5.2       37.65        perf-profile.children.cycles-pp.do_vmi_align_munmap
>      42.88            -5.2       37.65        perf-profile.children.cycles-pp.vms_clear_ptes
>      42.88            -5.2       37.65        perf-profile.children.cycles-pp.vms_complete_munmap_vmas
>      42.86            -5.2       37.64        perf-profile.children.cycles-pp.do_vmi_munmap
>      42.62            -5.2       37.40        perf-profile.children.cycles-pp.folios_put_refs
>      42.60            -5.2       37.40        perf-profile.children.cycles-pp.__tlb_batch_free_encoded_pages
>      42.60            -5.2       37.40        perf-profile.children.cycles-pp.free_pages_and_swap_cache
>      41.93            -5.1       36.84        perf-profile.children.cycles-pp.__page_cache_release
>      41.80 ±  2%      -5.1       36.72        perf-profile.children.cycles-pp.unmap_page_range
>      41.80 ±  2%      -5.1       36.72        perf-profile.children.cycles-pp.unmap_vmas
>      41.80 ±  2%      -5.1       36.72        perf-profile.children.cycles-pp.zap_pmd_range
>      41.80 ±  2%      -5.1       36.72        perf-profile.children.cycles-pp.zap_pte_range
>      41.51 ±  2%      -5.1       36.45        perf-profile.children.cycles-pp.tlb_flush_mmu
>       3.89 ±  4%      -2.4        1.53 ±  8%  perf-profile.children.cycles-pp.si_meminfo
>       3.84 ±  4%      -2.4        1.49 ±  8%  perf-profile.children.cycles-pp.nr_blockdev_pages
>       3.11 ±  2%      -0.6        2.46 ±  2%  perf-profile.children.cycles-pp.alloc_anon_folio
>       1.90            -0.4        1.52        perf-profile.children.cycles-pp.vma_alloc_folio_noprof
>       1.89            -0.4        1.52        perf-profile.children.cycles-pp.alloc_pages_mpol
>       1.84            -0.4        1.48        perf-profile.children.cycles-pp.__alloc_frozen_pages_noprof
>       1.73            -0.3        1.39        perf-profile.children.cycles-pp.get_page_from_freelist
>       0.56 ± 72%      -0.3        0.22 ±108%  perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
>       1.45 ±  6%      -0.3        1.14 ±  3%  perf-profile.children.cycles-pp.__pte_offset_map_lock
>       1.22            -0.3        0.96        perf-profile.children.cycles-pp.prep_new_page
>       1.16 ±  7%      -0.3        0.90 ±  5%  perf-profile.children.cycles-pp.__mem_cgroup_charge
>       1.19            -0.3        0.93        perf-profile.children.cycles-pp.clear_page_erms
>       0.26 ±  8%      -0.1        0.16 ±  3%  perf-profile.children.cycles-pp.handle_internal_command
>       0.26 ±  8%      -0.1        0.16 ±  3%  perf-profile.children.cycles-pp.main
>       0.26 ±  8%      -0.1        0.16 ±  3%  perf-profile.children.cycles-pp.run_builtin
>       0.44 ± 10%      -0.1        0.35 ±  6%  perf-profile.children.cycles-pp.free_unref_folios
>       0.25 ±  9%      -0.1        0.16 ±  3%  perf-profile.children.cycles-pp.record__mmap_read_evlist
>       0.40 ± 11%      -0.1        0.31 ±  6%  perf-profile.children.cycles-pp.free_frozen_page_commit
>       0.24 ±  8%      -0.1        0.16 ±  4%  perf-profile.children.cycles-pp.perf_mmap__push
>       0.38 ± 13%      -0.1        0.30 ±  7%  perf-profile.children.cycles-pp.free_pcppages_bulk
>       0.55            -0.1        0.48        perf-profile.children.cycles-pp.sync_regs
>       0.48 ±  4%      -0.1        0.42 ±  2%  perf-profile.children.cycles-pp.native_irq_return_iret
>       0.37 ±  4%      -0.1        0.31 ±  3%  perf-profile.children.cycles-pp.rmqueue
>       0.35 ±  4%      -0.1        0.30 ±  3%  perf-profile.children.cycles-pp.rmqueue_pcplist
>       0.19 ±  6%      -0.0        0.14 ±  3%  perf-profile.children.cycles-pp.record__pushfn
>       0.18 ±  7%      -0.0        0.13 ±  2%  perf-profile.children.cycles-pp.ksys_write
>       0.17 ±  5%      -0.0        0.13 ±  3%  perf-profile.children.cycles-pp.vfs_write
>       0.28 ±  5%      -0.0        0.24 ±  3%  perf-profile.children.cycles-pp.__rmqueue_pcplist
>       0.31            -0.0        0.27        perf-profile.children.cycles-pp.lru_add
>       0.16 ±  5%      -0.0        0.12 ±  3%  perf-profile.children.cycles-pp.shmem_file_write_iter
>       0.24 ±  6%      -0.0        0.20 ±  5%  perf-profile.children.cycles-pp.rmqueue_bulk
>       0.16 ±  4%      -0.0        0.12 ±  3%  perf-profile.children.cycles-pp.generic_perform_write
>       0.24 ±  2%      -0.0        0.20        perf-profile.children.cycles-pp.lru_gen_add_folio
>       0.21            -0.0        0.18        perf-profile.children.cycles-pp.lru_gen_del_folio
>       0.25 ±  2%      -0.0        0.22        perf-profile.children.cycles-pp.zap_present_ptes
>       0.14 ±  2%      -0.0        0.12 ±  3%  perf-profile.children.cycles-pp.lock_vma_under_rcu
>       0.14 ±  3%      -0.0        0.12 ±  4%  perf-profile.children.cycles-pp.__mod_node_page_state
>       0.13            -0.0        0.12 ±  4%  perf-profile.children.cycles-pp.__perf_sw_event
>       0.06 ±  7%      -0.0        0.05        perf-profile.children.cycles-pp.___pte_offset_map
>       0.09 ±  5%      -0.0        0.08        perf-profile.children.cycles-pp.__mem_cgroup_uncharge_folios
>       0.08 ±  6%      -0.0        0.06 ±  6%  perf-profile.children.cycles-pp.vma_merge_extend
>       0.11 ±  3%      -0.0        0.10        perf-profile.children.cycles-pp.__free_one_page
>       0.07            -0.0        0.06        perf-profile.children.cycles-pp.error_entry
>       0.06            -0.0        0.05        perf-profile.children.cycles-pp.__mod_zone_page_state
>       0.11            -0.0        0.10        perf-profile.children.cycles-pp.___perf_sw_event
>       0.10 ±  4%      +0.0        0.11 ±  4%  perf-profile.children.cycles-pp.sched_tick
>       0.21 ±  3%      +0.0        0.24 ±  5%  perf-profile.children.cycles-pp.update_process_times
>       0.22 ±  3%      +0.0        0.26 ±  7%  perf-profile.children.cycles-pp.tick_nohz_handler
>       0.30 ±  4%      +0.0        0.34 ±  6%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
>       0.29 ±  4%      +0.0        0.33 ±  6%  perf-profile.children.cycles-pp.hrtimer_interrupt
>       0.39 ±  2%      +0.0        0.43 ±  2%  perf-profile.children.cycles-pp.mremap
>       0.31 ±  4%      +0.0        0.36 ±  5%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
>       0.34 ±  3%      +0.0        0.39 ±  5%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
>       0.28 ±  3%      +0.1        0.34 ±  2%  perf-profile.children.cycles-pp.__do_sys_mremap
>       0.28 ±  2%      +0.1        0.34 ±  3%  perf-profile.children.cycles-pp.do_mremap
>       0.11 ±  4%      +0.1        0.17 ±  2%  perf-profile.children.cycles-pp.expand_vma
>       0.00            +0.1        0.08        perf-profile.children.cycles-pp.__vm_enough_memory
>       0.00            +0.1        0.09 ±  5%  perf-profile.children.cycles-pp.vrm_calc_charge
>       0.04 ±141%      +0.1        0.13 ± 16%  perf-profile.children.cycles-pp.add_callchain_ip
>       0.04 ±142%      +0.1        0.14 ± 17%  perf-profile.children.cycles-pp.thread__resolve_callchain_sample
>       0.04 ±142%      +0.1        0.17 ± 15%  perf-profile.children.cycles-pp.__thread__resolve_callchain
>       0.04 ±142%      +0.1        0.18 ± 15%  perf-profile.children.cycles-pp.sample__for_each_callchain_node
>       0.05 ±141%      +0.1        0.18 ± 14%  perf-profile.children.cycles-pp.build_id__mark_dso_hit
>       0.05 ±141%      +0.1        0.19 ± 14%  perf-profile.children.cycles-pp.perf_session__deliver_event
>       0.05 ±141%      +0.1        0.20 ± 14%  perf-profile.children.cycles-pp.__ordered_events__flush
>       0.05 ±141%      +0.1        0.20 ± 33%  perf-profile.children.cycles-pp.perf_session__process_events
>       0.05 ±141%      +0.1        0.20 ± 33%  perf-profile.children.cycles-pp.record__finish_output
>      88.59            +1.5       90.13        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
>      45.34 ±  2%      +7.2       52.54 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
>      44.15 ±  2%      +7.5       51.61 ±  2%  perf-profile.children.cycles-pp.do_sysinfo
>      44.33 ±  2%      +7.5       51.80 ±  2%  perf-profile.children.cycles-pp.sysinfo
>      44.28 ±  2%      +7.5       51.75 ±  2%  perf-profile.children.cycles-pp.__do_sys_sysinfo
>      40.25 ±  2%      +9.8       50.07 ±  2%  perf-profile.children.cycles-pp.si_swapinfo
>       0.55 ± 74%      -0.3        0.22 ±107%  perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
>       1.50 ±  4%      -0.3        1.17        perf-profile.self.cycles-pp._raw_spin_lock
>       1.18            -0.3        0.92        perf-profile.self.cycles-pp.clear_page_erms
>       2.01            -0.2        1.86 ±  3%  perf-profile.self.cycles-pp.stress_bigheap_child
>       0.55            -0.1        0.48        perf-profile.self.cycles-pp.sync_regs
>       0.48 ±  4%      -0.1        0.42 ±  2%  perf-profile.self.cycles-pp.native_irq_return_iret
>       0.14 ±  3%      -0.0        0.12 ±  4%  perf-profile.self.cycles-pp.get_page_from_freelist
>       0.14 ±  8%      -0.0        0.12 ±  3%  perf-profile.self.cycles-pp.do_anonymous_page
>       0.14 ±  2%      -0.0        0.12 ±  3%  perf-profile.self.cycles-pp.rmqueue_bulk
>       0.14            -0.0        0.12        perf-profile.self.cycles-pp.lru_gen_del_folio
>       0.11 ±  3%      -0.0        0.09 ±  4%  perf-profile.self.cycles-pp.__handle_mm_fault
>       0.15 ±  2%      -0.0        0.13        perf-profile.self.cycles-pp.lru_gen_add_folio
>       0.12 ±  3%      -0.0        0.10 ±  3%  perf-profile.self.cycles-pp.zap_present_ptes
>       0.12 ±  4%      -0.0        0.11        perf-profile.self.cycles-pp.__mod_node_page_state
>       0.07 ±  6%      -0.0        0.06        perf-profile.self.cycles-pp.lock_vma_under_rcu
>       0.10            -0.0        0.09 ±  4%  perf-profile.self.cycles-pp.__free_one_page
>       0.11 ±  3%      -0.0        0.10        perf-profile.self.cycles-pp.folios_put_refs
>       0.07            -0.0        0.06        perf-profile.self.cycles-pp.___perf_sw_event
>       0.07            -0.0        0.06        perf-profile.self.cycles-pp.do_user_addr_fault
>       0.07            -0.0        0.06        perf-profile.self.cycles-pp.lru_add
>       0.07            -0.0        0.06        perf-profile.self.cycles-pp.mas_walk
>       0.08            -0.0        0.07        perf-profile.self.cycles-pp.__alloc_frozen_pages_noprof
>       0.06            -0.0        0.05        perf-profile.self.cycles-pp.handle_mm_fault
>       0.06            -0.0        0.05        perf-profile.self.cycles-pp.page_counter_uncharge
>       0.00            +0.1        0.08        perf-profile.self.cycles-pp.__vm_enough_memory
>      88.36            +1.5       89.85        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> 
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
> 
Could you please test the patch below and confirm whether it resolves the regression:

<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ecbac900c35f..118de1a8348c 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3746,6 +3746,15 @@ vm_area_alloc_pages(gfp_t gfp, int nid,
 	return nr_allocated;
 }
 
+static void
+__vm_area_cleanup(struct vm_struct *area)
+{
+	if (area->pages)
+		vfree(area->addr);
+	else
+		free_vm_area(area);
+}
+
 static LLIST_HEAD(pending_vm_area_cleanup);
 static void cleanup_vm_area_work(struct work_struct *work)
 {
@@ -3756,12 +3765,8 @@ static void cleanup_vm_area_work(struct work_struct *work)
 	if (!head)
 		return;
 
-	llist_for_each_entry_safe(area, tmp, head, llnode) {
-		if (!area->pages)
-			free_vm_area(area);
-		else
-			vfree(area->addr);
-	}
+	llist_for_each_entry_safe(area, tmp, head, llnode)
+		__vm_area_cleanup(area);
 }
 
 /*
@@ -3769,8 +3774,11 @@ static void cleanup_vm_area_work(struct work_struct *work)
  * of partially initialized vm_struct in error paths.
  */
 static DECLARE_WORK(cleanup_vm_area, cleanup_vm_area_work);
-static void defer_vm_area_cleanup(struct vm_struct *area)
+static void vm_area_cleanup(struct vm_struct *area, bool can_block)
 {
+	if (can_block)
+		return __vm_area_cleanup(area);
+
 	if (llist_add(&area->llnode, &pending_vm_area_cleanup))
 		schedule_work(&cleanup_vm_area);
 }
@@ -3915,7 +3923,7 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 	return area->addr;
 
 fail:
-	defer_vm_area_cleanup(area);
+	vm_area_cleanup(area, gfpflags_allow_blocking(gfp_mask));
 	return NULL;
 }
<snip>


--
Uladzislau Rezki
