Message-ID: <b3eb99da-9293-43e8-a24d-f4082f747d6c@intel.com>
Date: Wed, 25 Jun 2025 16:03:19 +0200
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: Jaroslav Pulchart <jaroslav.pulchart@...ddata.com>,
	"intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>
CC: "Keller, Jacob E" <jacob.e.keller@...el.com>, Jakub Kicinski
	<kuba@...nel.org>, "Damato, Joe" <jdamato@...tly.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "Nguyen, Anthony L"
	<anthony.l.nguyen@...el.com>, Michal Swiatkowski
	<michal.swiatkowski@...ux.intel.com>, "Czapnik, Lukasz"
	<lukasz.czapnik@...el.com>, "Dumazet, Eric" <edumazet@...gle.com>, "Zaki,
 Ahmed" <ahmed.zaki@...el.com>, Martin Karsten <mkarsten@...terloo.ca>, "Igor
 Raits" <igor@...ddata.com>, Daniel Secik <daniel.secik@...ddata.com>, "Zdenek
 Pesek" <zdenek.pesek@...ddata.com>
Subject: Re: [Intel-wired-lan] Increased memory usage on NUMA nodes with ICE
 driver after upgrade to 6.13.y (regression in commit 492a044508ad)

On 6/25/25 14:17, Jaroslav Pulchart wrote:
> Hello
> 
> We are still facing the memory issue with Intel 810 NICs (even on latest 
> 6.15.y).
> 
> Our current stabilization and solution is to move everything to a new 
> INTEL-FREE server and get rid of last Intel sights there (after Intel's 
> CPU vulnerabilities fuckups NICs are next step).
> 
> Any help welcomed,
> Jaroslav P.
> 
> 

Thank you for urging us, I can understand the frustration.

We have identified some (unrelated) memory leaks and will ship fixes soon.
And, as none of the commits/versions you have posted was clearly
identified as the culprit, there is a chance that our random findings
could help. Anyway, getting to zero kmemleak reports is good in itself,
so that is a good start.

I will also ask my VAL to increase efforts in this area.

Przemek

> 
> On Wed, 4 Jun 2025 at 10:42, Jaroslav Pulchart
> <jaroslav.pulchart@...ddata.com> wrote:
> 
>      >
>      > On Thu, 17 Apr 2025 at 19:52, Keller, Jacob E
>      > <jacob.e.keller@...el.com> wrote:
>      > >
>      > >
>      > >
>      > > > -----Original Message-----
>      > > > From: Jakub Kicinski <kuba@...nel.org>
>      > > > Sent: Wednesday, April 16, 2025 5:13 PM
>      > > > To: Keller, Jacob E <jacob.e.keller@...el.com>
>      > > > Cc: Jaroslav Pulchart <jaroslav.pulchart@...ddata.com>; Kitszel, Przemyslaw
>      > > > <przemyslaw.kitszel@...el.com>; Damato, Joe <jdamato@...tly.com>;
>      > > > intel-wired-lan@...ts.osuosl.org; netdev@...r.kernel.org; Nguyen, Anthony L
>      > > > <anthony.l.nguyen@...el.com>; Igor Raits <igor@...ddata.com>; Daniel Secik
>      > > > <daniel.secik@...ddata.com>; Zdenek Pesek <zdenek.pesek@...ddata.com>;
>      > > > Dumazet, Eric <edumazet@...gle.com>; Martin Karsten
>      > > > <mkarsten@...terloo.ca>; Zaki, Ahmed <ahmed.zaki@...el.com>; Czapnik,
>      > > > Lukasz <lukasz.czapnik@...el.com>; Michal Swiatkowski
>      > > > <michal.swiatkowski@...ux.intel.com>
>      > > > Subject: Re: [Intel-wired-lan] Increased memory usage on NUMA nodes with ICE
>      > > > driver after upgrade to 6.13.y (regression in commit 492a044508ad)
>      > > >
>      > > > On Wed, 16 Apr 2025 22:57:10 +0000 Keller, Jacob E wrote:
>      > > > > > > And you're reverting just and exactly 492a044508ad13 ?
>      > > > > > > The memory for persistent config is allocated in alloc_netdev_mqs()
>      > > > > > > unconditionally. I'm lost as to how this commit could make any
>      > > > > > > difference :(
>      > > > > >
>      > > > > > Yes, reverted the 492a044508ad13.
>      > > > >
>      > > > > Struct napi_config *is* 1056 bytes
>      > > >
>      > > > You're probably looking at 6.15-rcX kernels. Yes, the affinity mask
>      > > > can be large depending on the kernel config. But report is for 6.13,
>      > > > AFAIU. In 6.13 and 6.14 napi_config was tiny.
>      > >
>      > > Regardless, it should still be ~64KB even in that case which is a far
>      > > cry from eating all available memory. Something else must be going
>      > > on....
>      > >
>      > > Thanks,
>      > > Jake
>      >
>      > Hello
>      >
>      > Some observations: this "problem" still exists with the latest 6.14.y,
>      > and there must be multiple issues. The memory utilization is slowly
>      > going down, from 3GB to 100MB in 10-20 days, on the home NUMA nodes
>      > where the Intel X810 NICs are (looks like some memory leak related to
>      > networking).
>      >
>      > So without the revert, the kswapdX usage is observed quickly, within
>      > 1-2 days; with the revert of the mentioned commit, kswapdX starts to
>      > consume resources later, after ~10-20 days. It is almost impossible to
>      > use servers with Intel X810 cards (ice driver) with recent Linux kernels.
>      >
>      > Were you able to reproduce the memory problems in your testbed?
>      >
>      > Best,
>      > Jaroslav
> 
>     Hello
> 
>     I deployed Linux 6.15.0 to our servers 7 days ago and observed the
>     memory utilization of the home NUMA nodes of the Intel X810 NICs:
>     1/ there is no need to revert the commit as before,
>     2/ the memory is continuously consumed (like a memory leak);
>     see the attached "7d_memory_usage_per_numa_linux6.15.0.png" screenshot
>     of 8x NUMA nodes (NUMA0 + NUMA1 are local to the X810 NICs). BTW: We do
>     not see this memory utilization pattern on servers using Broadcom
>     NetXtreme-E NICs.
> 
> 
> 
> -- 
> Jaroslav Pulchart
> Sr. Principal SW Engineer
> GoodData

