Message-ID: <e4b6sjeh22uqhxhxudsbanlnyo2potwowuy7mkrp6tvxnftjn4@mcjyes2s3eu6>
Date: Wed, 17 Dec 2025 12:52:55 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: Uladzislau Rezki <urezki@...il.com>
Cc: Oliver Sang <oliver.sang@...el.com>, oe-lkp@...ts.linux.dev, 
	lkp@...el.com, linux-kernel@...r.kernel.org, 
	Andrew Morton <akpm@...ux-foundation.org>, Michal Hocko <mhocko@...e.com>, Baoquan He <bhe@...hat.com>, 
	Alexander Potapenko <glider@...gle.com>, Andrey Ryabinin <ryabinin.a.a@...il.com>, 
	Marco Elver <elver@...gle.com>, Michal Hocko <mhocko@...nel.org>, linux-mm@...ck.org
Subject: Re: [linus:master] [mm/vmalloc]  9c47753167:
 stress-ng.bigheap.realloc_calls_per_sec 21.3% regression

On Wed, Dec 17, 2025 at 12:04:20PM +0100, Uladzislau Rezki wrote:
> Hello, Oliver.
> 
> > > > 
> > > > Hello,
> > > > 
> > > > kernel test robot noticed a 21.3% regression of stress-ng.bigheap.realloc_calls_per_sec on:
> > > > 
> > > > 
> > > > commit: 9c47753167a6a585d0305663c6912f042e131c2d ("mm/vmalloc: defer freeing partly initialized vm_struct")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > 
> > > > [still regression on linus/master      c9b47175e9131118e6f221cc8fb81397d62e7c91]
> > > > [still regression on linux-next/master 008d3547aae5bc86fac3eda317489169c3fda112]
> > > > 
> > > > testcase: stress-ng
> > > > config: x86_64-rhel-9.4
> > > > compiler: gcc-14
> > > > test machine: 256 threads 2 sockets Intel(R) Xeon(R) 6767P  CPU @ 2.4GHz (Granite Rapids) with 256G memory
> > > > parameters:
> > > > 
> > > > 	nr_threads: 100%
> > > > 	testtime: 60s
> > > > 	test: bigheap
> > > > 	cpufreq_governor: performance
> > > > 
> > > > 
> > > > 
> > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > > the same patch/commit), kindly add following tags
> > > > | Reported-by: kernel test robot <oliver.sang@...el.com>
> > > > | Closes: https://lore.kernel.org/oe-lkp/202512121138.986f6a6b-lkp@intel.com
> > > > 
> > > > 
> > 
> > [...]
> > 
> > > > 
> > > Could you please test below patch and confirm if it solves regression:
> > 
> > we directly apply the patch upon 9c47753167, so our test branch looks like below
> > 
> > * f7991e8a0136cb <---- below patch from you
> > * 9c47753167a6a5 mm/vmalloc: defer freeing partly initialized vm_struct
> > * 86e968d8ca6dc8 mm/vmalloc: support non-blocking GFP flags in alloc_vmap_area()
> > 
> > but found it has little performance impacts
> > 
> > =========================================================================================
> > compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> >   gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-gnr-2sp3/bigheap/stress-ng/60s
> > 
> > 86e968d8ca6dc823 9c47753167a6a585d0305663c69 f7991e8a0136cb0fdf35f11e28a
> > ---------------- --------------------------- ---------------------------
> >          %stddev     %change         %stddev     %change         %stddev
> >              \          |                \          |                \
> >   48320196           -10.9%   43072080           -10.8%   43116499        stress-ng.bigheap.ops
> >     785159            -9.8%     708390            -9.7%     708644        stress-ng.bigheap.ops_per_sec
> >     879805           -21.3%     692805           -20.7%     697312        stress-ng.bigheap.realloc_calls_per_sec
> > 
> Thank you for testing. I had same expectations. No difference.
> Honestly i can not figure out how: 
> 
> * 9c47753167a6a5 mm/vmalloc: defer freeing partly initialized vm_struct
> * 86e968d8ca6dc8 mm/vmalloc: support non-blocking GFP flags in alloc_vmap_area()
> 
> can effect performance. I am not doing anything related to performance.
> I would like to ask you if you could test one more thing. I see that
> 
> [still regression on linus/master      c9b47175e9131118e6f221cc8fb81397d62e7c91]
> 
> contains also below patch:
> 
> <snip>
> commit a0615780439938e8e61343f1f92a4c54a71dc6a5
>     mm/vmalloc: request large order pages from buddy allocator
> <snip>
> 
> where we try to use larger order for vmalloc. Could you please revert
> it and rerun same tests?
> 

This being stress-ng, it is not doing what you think it is doing.

Profile shows increased contention on swapinfo spinlock:

        %stddev     %change         %stddev
             \          |                \
     40.08 ±  2%      +9.8       49.92 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.si_swapinfo.do_sysinfo.__do_sys_sysinfo

The spinlock and the data it operates on are not annotated.

The commit deferring freeing adds 2 global vars which most likely
shifted things around to add cacheline bouncing.
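
To illustrate the mechanism, here is a minimal userspace sketch. It assumes a 64-byte cacheline and takes "annotated" to mean cacheline-alignment annotations; the struct and field names are hypothetical and are not the actual swap or vmalloc variables. It only shows how an unrelated variable can land on the same line as a hot lock, and how explicit alignment keeps the two apart:

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignas, alignof */
#include <stdbool.h>
#include <stddef.h>   /* offsetof, size_t */

#define CACHELINE 64  /* assumed line size; typical for x86_64 */

/* Hypothetical layout: a hot lock packed next to an unrelated counter. */
struct packed_globals {
	int  lock;              /* contended by many CPUs */
	long protected_total;   /* data guarded by the lock */
	long unrelated_counter; /* written by a different code path */
};

/* Same fields, but the lock group and the counter each start a new line. */
struct padded_globals {
	alignas(CACHELINE) int  lock;
	long                    protected_total;
	alignas(CACHELINE) long unrelated_counter;
};

/* True if two byte offsets from a line-aligned base land in the same line. */
static bool offsets_share_line(size_t a, size_t b)
{
	return a / CACHELINE == b / CACHELINE;
}

static_assert(alignof(struct padded_globals) == CACHELINE,
	      "padded layout must itself be line-aligned");
```

In the packed layout every write to unrelated_counter invalidates the line holding the lock, so waiters on the lock pay for traffic they have nothing to do with. Plain unannotated globals behave like the packed case, except that the neighbours are chosen by the compiler and linker, which is why adding variables elsewhere can change who shares a line. In the kernel the comparable annotation would be ____cacheline_aligned_in_smp.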

That's the 3rd such case in the last few weeks that I know of.

I asked the gcc people to do something about it, so far no takers: https://gcc.gnu.org/pipermail/gcc/2024-October/245004.html
