Message-ID: <c764ad97-9c6a-46f5-a03b-cfa812cdb8e1@intel.com>
Date: Mon, 30 Jun 2025 09:02:18 -0700
From: Jacob Keller <jacob.e.keller@...el.com>
To: Jaroslav Pulchart <jaroslav.pulchart@...ddata.com>, Jakub Kicinski
<kuba@...nel.org>
CC: Przemek Kitszel <przemyslaw.kitszel@...el.com>,
"intel-wired-lan@...ts.osuosl.org" <intel-wired-lan@...ts.osuosl.org>,
"Damato, Joe" <jdamato@...tly.com>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "Nguyen, Anthony L" <anthony.l.nguyen@...el.com>,
Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>, "Czapnik, Lukasz"
<lukasz.czapnik@...el.com>, "Dumazet, Eric" <edumazet@...gle.com>, "Zaki,
Ahmed" <ahmed.zaki@...el.com>, Martin Karsten <mkarsten@...terloo.ca>, "Igor
Raits" <igor@...ddata.com>, Daniel Secik <daniel.secik@...ddata.com>, "Zdenek
Pesek" <zdenek.pesek@...ddata.com>
Subject: Re: [Intel-wired-lan] Increased memory usage on NUMA nodes with ICE
driver after upgrade to 6.13.y (regression in commit 492a044508ad)
On 6/30/2025 12:35 AM, Jaroslav Pulchart wrote:
>>
>>>
>>> On Wed, 25 Jun 2025 19:51:08 +0200 Jaroslav Pulchart wrote:
>>>> Great, please send me a link to the related patch set. I can apply them in
>>>> our kernel build and try them ASAP!
>>>
>>> Sorry if I'm repeating the question - have you tried
>>> CONFIG_MEM_ALLOC_PROFILING? Reportedly the overhead in recent kernels
>>> is low enough to use it for production workloads.
>>
>> I am trying it now; this is from the freshly booted server:
>>
>> # sort -g /proc/allocinfo| tail -n 15
>> 45409728 236509 fs/dcache.c:1681 func:__d_alloc
>> 71041024 17344 mm/percpu-vm.c:95 func:pcpu_alloc_pages
>> 71524352 11140 kernel/dma/direct.c:141 func:__dma_direct_alloc_pages
>> 85098496 4486 mm/slub.c:2452 func:alloc_slab_page
>> 115470992 101647 fs/ext4/super.c:1388 [ext4] func:ext4_alloc_inode
>> 134479872 32832 kernel/events/ring_buffer.c:811 func:perf_mmap_alloc_page
>> 141426688 34528 mm/filemap.c:1978 func:__filemap_get_folio
>> 191594496 46776 mm/memory.c:1056 func:folio_prealloc
>> 360710144 172 mm/khugepaged.c:1084 func:alloc_charge_folio
>> 444076032 33790 mm/slub.c:2450 func:alloc_slab_page
>> 530579456 129536 mm/page_ext.c:271 func:alloc_page_ext
>> 975175680 465 mm/huge_memory.c:1165 func:vma_alloc_anon_folio_pmd
>> 1022427136 249616 mm/memory.c:1054 func:folio_prealloc
>> 1105125376 139252 drivers/net/ethernet/intel/ice/ice_txrx.c:681 [ice] func:ice_alloc_mapped_page
>> 1621598208 395848 mm/readahead.c:186 func:ractl_alloc_folio
>>
>
> The "drivers/net/ethernet/intel/ice/ice_txrx.c:681 [ice]
> func:ice_alloc_mapped_page" is just growing...
>
> # uptime ; sort -g /proc/allocinfo| tail -n 15
> 09:33:58 up 4 days, 6 min, 1 user, load average: 6.65, 8.18, 9.81
>
> # sort -g /proc/allocinfo| tail -n 15
> 85216896 443838 fs/dcache.c:1681 func:__d_alloc
> 106156032 25917 mm/shmem.c:1854 func:shmem_alloc_folio
> 116850096 102861 fs/ext4/super.c:1388 [ext4] func:ext4_alloc_inode
> 134479872 32832 kernel/events/ring_buffer.c:811 func:perf_mmap_alloc_page
> 143556608 6894 mm/slub.c:2452 func:alloc_slab_page
> 186793984 45604 mm/memory.c:1056 func:folio_prealloc
> 362807296 88576 mm/percpu-vm.c:95 func:pcpu_alloc_pages
> 530579456 129536 mm/page_ext.c:271 func:alloc_page_ext
> 598237184 51309 mm/slub.c:2450 func:alloc_slab_page
> 838860800 400 mm/huge_memory.c:1165 func:vma_alloc_anon_folio_pmd
> 929083392 226827 mm/filemap.c:1978 func:__filemap_get_folio
> 1034657792 252602 mm/memory.c:1054 func:folio_prealloc
> 1262485504 602 mm/khugepaged.c:1084 func:alloc_charge_folio
> 1335377920 325970 mm/readahead.c:186 func:ractl_alloc_folio
> 2544877568 315003 drivers/net/ethernet/intel/ice/ice_txrx.c:681 [ice] func:ice_alloc_mapped_page
>
ice_alloc_mapped_page is the function used to allocate the pages for the
Rx ring buffers.
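
Roughly speaking, the pattern at that call site looks like the sketch
below. This is a simplified illustration, not the actual driver source;
the struct and function names (rx_buf_sketch, alloc_mapped_page_sketch)
are invented for the example. Each Rx buffer gets a DMA-mapped page, and
the driver takes a large page refcount bias up front so the hot path can
hand fragments to the stack and recycle the page without touching the
atomic refcount on every frame:

/* Simplified sketch of the allocation pattern at ice_txrx.c:681.
 * Illustrative only, not the actual ice driver source; struct and
 * function names here are invented for the example.
 */
#include <linux/dma-mapping.h>
#include <linux/limits.h>
#include <linux/mm.h>
#include <linux/skbuff.h>       /* dev_alloc_pages() */

struct rx_buf_sketch {                  /* stand-in for the Rx buffer struct */
        struct page *page;
        dma_addr_t dma;
        unsigned int page_offset;
        u16 pagecnt_bias;
};

static bool alloc_mapped_page_sketch(struct device *dev,
                                     struct rx_buf_sketch *bi)
{
        struct page *page;
        dma_addr_t dma;

        if (bi->page)                   /* buffer already has a page */
                return true;

        page = dev_alloc_pages(0);      /* one page per Rx buffer here */
        if (!page)
                return false;

        dma = dma_map_page_attrs(dev, page, 0, PAGE_SIZE,
                                 DMA_FROM_DEVICE, 0);
        if (dma_mapping_error(dev, dma)) {
                __free_pages(page, 0);
                return false;
        }

        bi->page = page;
        bi->dma = dma;
        bi->page_offset = 0;

        /* Take a large refcount bias up front; the hot path consumes
         * the bias per frame instead of touching the atomic refcount.
         */
        page_ref_add(page, USHRT_MAX - 1);
        bi->pagecnt_bias = USHRT_MAX;

        return true;
}

Since the allocation profiling counter tracks live allocations per
callsite, any page allocated here and never freed shows up as growth of
exactly this /proc/allocinfo line.
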
There were a number of hot-path fixes from Maciej which might be
related. Although those fixes were aimed primarily at XDP, they affect
the regular hot path as well.

They were made on top of work of his that landed in v6.13, so it seems
plausible they are related. In particular, one of them mentions a
missing buffer put:

743bbd93cf29 ("ice: put Rx buffers after being done with current frame")

Its commit message says the following:
> While at it, address an error path of ice_add_xdp_frag() - we were
> missing buffer putting from day 1 there.
>
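
To illustrate why a missing put matters (again a sketch, continuing the
invented names above rather than the real ice_add_xdp_frag() /
ice_put_rx_buf() code): once the driver is done with an Rx buffer it has
to drop the reference it holds on the page, and an error path that
returns early without doing so strands that page. The refill logic then
allocates a replacement, which is exactly the kind of slow per-callsite
growth the /proc/allocinfo output above shows:

/* Sketch of the "missing buffer put" pattern the commit refers to.
 * Illustrative only; names are invented, not the ice functions.
 */
static void put_rx_buf_sketch(struct device *dev, struct rx_buf_sketch *bi)
{
        if (!bi->page)
                return;

        /* Release our hold on the page: unmap it and drop whatever
         * refcount bias is still outstanding.  (The real driver may
         * recycle the page instead when it is the sole owner.)
         */
        dma_unmap_page_attrs(dev, bi->dma, PAGE_SIZE, DMA_FROM_DEVICE, 0);
        __page_frag_cache_drain(bi->page, bi->pagecnt_bias);
        bi->page = NULL;
}

static int add_frag_sketch(struct device *dev, struct rx_buf_sketch *bi,
                           bool frag_table_full)
{
        if (frag_table_full) {
                /* Error path: without this put, the reference held for
                 * this buffer is never dropped.  The page stays pinned,
                 * the ring refill allocates a replacement, and the
                 * allocation counter for the callsite keeps climbing.
                 */
                put_rx_buf_sketch(dev, bi);
                return -ENOMEM;
        }

        /* ... otherwise attach bi->page as a fragment of the frame ... */
        return 0;
}
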
It seems to me the issue must somehow be related to the buffer cleanup
logic for the Rx ring, since those Rx buffer pages are the only thing
ice_alloc_mapped_page allocates.
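
For context, the recycle-vs-release decision in that cleanup path is
roughly of the following shape (same caveat as before: an illustrative
sketch that continues the ones above, not ice_can_reuse_rx_page()
itself). A page only goes back onto the ring when the refcount
bookkeeping says the driver is effectively the sole owner again;
otherwise, or if the put is skipped entirely, the ring has to be
refilled with a newly allocated page:

/* Sketch of the recycle-vs-release decision for an Rx buffer page. */
static bool can_reuse_page_sketch(struct rx_buf_sketch *bi)
{
        struct page *page = bi->page;

        /* Someone else (the stack, a queued frame, ...) still holds a
         * reference beyond our bias, allowing one for the frame
         * currently in flight, so the page cannot be recycled yet.
         */
        if (page_ref_count(page) - bi->pagecnt_bias > 1)
                return false;

        /* Bias nearly exhausted: top the refcount back up so the page
         * can keep being handed out without atomics on the hot path.
         */
        if (bi->pagecnt_bias == 1) {
                page_ref_add(page, USHRT_MAX - 1);
                bi->pagecnt_bias = USHRT_MAX;
        }

        return true;
}

In this scheme a page is never freed on the hot path at all: it is
either recycled or has its remaining bias drained in the cleanup path,
which is why a missed or broken put shows up as steady growth at the
allocation site rather than as an immediate error.
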
It might be something that Maciej's work fixed... but it seems very
weird that 492a044508ad ("ice: Add support for persistent NAPI config")
would affect that logic at all.