netdev - Re: Increased memory usage on NUMA nodes with ICE driver after upgrade to 6.13.y (regression in commit 492a044508ad)

Message-ID: <CAK8fFZ6ML1v8VCjN3F-r+SFT8oF0xNpi3hjA77aRNwr=HcWqNA@mail.gmail.com>
Date: Wed, 16 Apr 2025 09:13:23 +0200
From: Jaroslav Pulchart <jaroslav.pulchart@...ddata.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Przemek Kitszel <przemyslaw.kitszel@...el.com>, jdamato@...tly.com, 
	intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org, 
	Tony Nguyen <anthony.l.nguyen@...el.com>, Igor Raits <igor@...ddata.com>, 
	Daniel Secik <daniel.secik@...ddata.com>, Zdenek Pesek <zdenek.pesek@...ddata.com>, 
	Eric Dumazet <edumazet@...gle.com>, Martin Karsten <mkarsten@...terloo.ca>, 
	Ahmed Zaki <ahmed.zaki@...el.com>, "Czapnik, Lukasz" <lukasz.czapnik@...el.com>, 
	Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>
Subject: Re: Increased memory usage on NUMA nodes with ICE driver after
 upgrade to 6.13.y (regression in commit 492a044508ad)

On Wed, 16 Apr 2025 at 2:54, Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Tue, 15 Apr 2025 16:38:40 +0200 Przemek Kitszel wrote:
> > > We traced the issue to commit 492a044508ad13a490a24c66f311339bf891cb5f
> > > "ice: Add support for persistent NAPI config".
> >
> > thank you for the report and bisection,
> > this commit is ice's opt-in into using persistent napi_config
> >
> > I have checked the code, and there is nothing obvious to inflate memory
> > consumption in the driver/core in the touched parts. I have not yet
> > looked into how much memory is eaten by the hash array of now-kept
> > configs.
>
> +1 also unclear to me how that commit makes any difference.
>
> Jaroslav, when you say "traced" what do you mean?
> CONFIG_MEM_ALLOC_PROFILING ?
>
> The napi_config struct is just 24B. The queue struct (we allocate
> napi_config for each queue) is 320B...

By "traced" I mean running the kernel and checking the memory situation
on the NUMA nodes with and without production load. NUMA nodes with an
X810 NIC show considerably less available memory with the default queue
count (the number of all CPUs). The count needs to be lowered to 1-2 on
unused interfaces, and to at most the number of cores in the NUMA node
on used interfaces, to make the memory allocation reasonable and keep
the server from running "kswapd"...

See "MemFree" on NUMA nodes 0 + 1 on a different, smaller but utilized
(running VMs + using network) host server with 8 NUMA nodes (32GB RAM
each, 28G in hugepages for VMs and 4GB for the host OS):

6.13.y vanilla (lot of kswapd0 in background):
    NUMA nodes:     0       1       2       3       4       5       6       7
    HPTotalGiB:     28      28      28      28      28      28      28      28
    HPFreeGiB:      0       0       0       0       0       0       0       0
    MemTotal:       32220   32701   32701   32686   32701   32701   32701   32696
    MemFree:        274     254     1327    1928    1949    2683    2624    2769
6.13.y + Revert (no memory issues at all):
    NUMA nodes:     0       1       2       3       4       5       6       7
    HPTotalGiB:     28      28      28      28      28      28      28      28
    HPFreeGiB:      0       0       0       0       0       0       0       0
    MemTotal:       32220   32701   32701   32686   32701   32701   32701   32696
    MemFree:        2213    2438    3402    3108    2846    2672    2592    3063
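
For anyone who wants to collect comparable per-node numbers, here is a
minimal sketch. It assumes the standard per-node sysfs meminfo format
("Node <N> MemFree: <value> kB") and converts kB to MiB; this is not
necessarily the tool that produced the tables above, and the function
name node_memfree_mib is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read per-node meminfo lines on stdin and print
# "node<N> <MemFree in MiB>" for every MemFree line found.
node_memfree_mib() {
    awk '$1 == "Node" && $3 == "MemFree:" {
        # Field 4 is the value in kB; integer-truncate to MiB.
        printf "node%s %d\n", $2, $4 / 1024
    }'
}

# Usage on a live host (standard Linux sysfs location):
#   cat /sys/devices/system/node/node*/meminfo | node_memfree_mib
```

Running it before and after changing the queue count on an otherwise
idle host should show whether the free memory on the NIC's node moves.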

We need to lower the queue count on all X810 interfaces from the default
(64 in this case) to ensure we have memory available for host OS services:
    ethtool -L em2 combined 1
    ethtool -L p3p2 combined 1
    ethtool -L em1 combined 6
    ethtool -L p3p1 combined 6
This trick "does not work" without the revert.
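
The same settings can be written as a small loop, sketched below. The
interface names and counts are the ones from this host; the helper
queue_cmds and its "interface:count" argument format are made up for
illustration. It only prints the ethtool commands, so nothing changes
until the output is piped to a shell as root.

```shell
#!/bin/sh
# Hypothetical helper: turn "interface:count" pairs into ethtool commands.
# Prints the commands rather than running them (dry run by default).
queue_cmds() {
    for pair in "$@"; do
        ifname=${pair%%:*}   # text before the first ':'
        count=${pair##*:}    # text after the last ':'
        printf 'ethtool -L %s combined %s\n' "$ifname" "$count"
    done
}

# Unused interfaces get 1 queue, used interfaces get 6, as listed above.
queue_cmds em2:1 p3p2:1 em1:6 p3p1:6
```

To apply the settings, pipe the output to sh; "ethtool -l <interface>"
shows the current and maximum channel counts for checking the result.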