netdev - Re: Possible Memory tracking bug with Intel ICE driver and jumbo frames


Message-ID: <8407689d-8a6a-4a94-8aba-a8ca134838fc@intel.com>
Date: Tue, 8 Jul 2025 17:42:36 -0700
From: Jacob Keller <jacob.e.keller@...el.com>
To: Christoph Petrausch <christoph.petrausch@...pl.com>,
	<netdev@...r.kernel.org>
Subject: Re: Possible Memory tracking bug with Intel ICE driver and jumbo
 frames



On 4/30/2025 1:59 AM, Christoph Petrausch wrote:
> Hello everyone,
> 
> We have noticed that when a node is under memory pressure and receiving 
> a lot of traffic, the PFMemallocDrop counter increases. This is to be 
> expected as the kernel tries to protect important services with the 
> pfmemalloc mechanism. However, once the memory pressure is gone and we 
> have many gigabytes of free memory again, we still see the 
> PFMemallocDrop counter increasing. We also see incoming jumbo frames 
> from new TCP connections being dropped; existing TCP connections seem to
> be unaffected. Packets with a packet size below 1500 are received 
> without any problems. If we reduce the interface's MTU to 1500 or below, 
> we can't reproduce the problem. Also, if a node is in a broken state, 
> setting the MTU to 1500 will fix the node. We can even increase the MTU 
> back to 9086 and don't see any dropped packets. We have observed this 
> behaviour with both the in-kernel ICE driver and the third-party Intel
> driver [1].
> 
> 
> We can't reproduce the problem on kernel 5.15, but have seen it on 
> v5.17, v5.18, v6.1, v6.2, v6.6.85, v6.8 and
> v6.15-rc4-42-gb6ea1680d0ac. I'm in the process of git bisecting to find 
> the commit that introduced this broken behaviour.
> 
> On kernel 5.15, jumbo frames are received normally after the memory 
> pressure is gone.
> 

Sorry for the late reply. I may have a change that affects this, as we
found a memory leak in the way we handle multi-buffer frames at 9K MTU.

It may not be a complete fix, and I haven't yet reproduced your setup
with PFMemallocDrop, but I thought you might be interested in this
thread, and the resulting fix which I'll be posting soon. I will CC you
in the thread when I post it.

[1]:
https://lore.kernel.org/netdev/CAK8fFZ4hY6GUJNENz3wY9jaYLZXGfpr7dnZxzGMYoE44caRbgw@mail.gmail.com/

> 
> 
> To reproduce, we currently use 2 servers (server-rx, server-tx) with an
> Intel E810-XXV NIC. To generate network traffic, we run 2 iperf3
> processes with 100 threads each on the load generating server server-tx:
>
>   iperf3 -c server-rx -P 100 -t 3000 -p 5201
>   iperf3 -c server-rx -P 100 -t 3000 -p 5202
>
> On the receiving server server-rx, we set up two iperf3 servers:
>
>   iperf3 -s -p 5201
>   iperf3 -s -p 5202
>
> To generate memory pressure, we start stress-ng on server-rx:
>
>   stress-ng --vm 1000 --vm-bytes $(free -g -L | awk '{ print $8 }')G --vm-keep --timeout 1200s
>
> This consumes all the currently free memory. As soon as the
> PFMemallocDrop counter increases, we stop stress-ng. Now we see plenty
> of free memory again, but the counter is still increasing and we have
> seen problems with new TCP sessions as soon as their packet size is
> above 1500 bytes.
>
> [1] https://github.com/intel/ethernet-linux-ice
>
> Best regards,
> Christoph Petrausch
> 
> 
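For anyone trying to reproduce this, the step "stop stress-ng as soon as
the PFMemallocDrop counter increases" can be scripted. Below is a minimal
sketch, assuming the counter is exposed as TcpExt PFMemallocDrop in
/proc/net/netstat; the function names (parse_netstat, pfmemalloc_drops,
watch) are hypothetical helpers for illustration and are not part of the
original report.

```python
import time
from pathlib import Path


def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {"Prefix.Name": value}."""
    counters = {}
    lines = text.strip().splitlines()
    # The file is laid out in pairs: a line of counter names followed by
    # a line of values, both starting with the same "Prefix:" label.
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        _, nums = values.split(":", 1)
        for name, num in zip(names.split(), nums.split()):
            counters[prefix.strip() + "." + name] = int(num)
    return counters


def pfmemalloc_drops(path="/proc/net/netstat"):
    """Return the current PFMemallocDrop value, or None if it is absent."""
    text = Path(path).read_text()
    return parse_netstat(text).get("TcpExt.PFMemallocDrop")


def watch(samples=10, interval=1.0):
    """Print the per-interval increase of PFMemallocDrop (assumed present)."""
    prev = pfmemalloc_drops()
    for _ in range(samples):
        time.sleep(interval)
        cur = pfmemalloc_drops()
        print("PFMemallocDrop:", cur, "delta:", cur - prev)
        prev = cur
```

Running watch() on server-rx while stress-ng is active shows when the
counter starts to move; running it again after stress-ng has stopped
shows whether drops continue once memory is free again.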


