Message-ID: <06415c07-5f29-4e1d-99c3-29e76cc2f1ae@deepl.com>
Date: Wed, 30 Apr 2025 10:59:32 +0200
From: Christoph Petrausch <christoph.petrausch@...pl.com>
To: netdev@...r.kernel.org
Subject: Possible Memory tracking bug with Intel ICE driver and jumbo frames

Hello everyone,

We have noticed that when a node is under memory pressure and receiving 
a lot of traffic, the PFMemallocDrop counter increases. This is to be 
expected as the kernel tries to protect important services with the 
pfmemalloc mechanism. However, once the memory pressure is gone and we 
have many gigabytes of free memory again, we still see the 
PFMemallocDrop counter increasing. We also see incoming jumbo frames 
from new TCP connections being dropped, while existing TCP connections 
seem to be unaffected. Packets with a size below 1500 bytes are received 
without any problems. If we reduce the interface's MTU to 1500 or below, 
we can't reproduce the problem. Also, if a node is in a broken state, 
setting the MTU to 1500 will fix the node. We can even increase the MTU 
back to 9086 and don't see any dropped packets. We have observed this 
behaviour with both the in-kernel ICE driver and the third-party Intel 
driver [1].
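
For reference, the MTU toggle we use to recover a broken node looks 
roughly like this (eth0 is a placeholder for the affected interface):

    ip link set dev eth0 mtu 1500
    ip link set dev eth0 mtu 9086

After this, jumbo frames are received again without drops.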


We can't reproduce the problem on kernel 5.15, but have seen it on 
v5.17, v5.18, v6.1, v6.2, v6.6.85, v6.8, and 
v6.15-rc4-42-gb6ea1680d0ac. I'm in the process of git bisecting to find 
the commit that introduced this broken behaviour.

On kernel 5.15, jumbo frames are received normally after the memory 
pressure is gone.
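
The bisection I am running is roughly:

    git bisect start
    git bisect bad v5.17
    git bisect good v5.15

building each candidate kernel and re-running the reproduction steps 
below to classify it as good or bad.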



To reproduce, we currently use 2 servers (server-rx, server-tx) with an 
Intel E810-XXV NIC. To generate network traffic, we run 2 iperf3 
processes with 100 parallel streams each on the load-generating server 
server-tx:

    iperf3 -c server-rx -P 100 -t 3000 -p 5201
    iperf3 -c server-rx -P 100 -t 3000 -p 5202

On the receiving server server-rx, we set up two iperf3 servers:

    iperf3 -s -p 5201
    iperf3 -s -p 5202

To generate memory pressure, we start stress-ng on server-rx:

    stress-ng --vm 1000 --vm-bytes $(free -g -L | awk '{ print $8 }')G --vm-keep --timeout 1200s

This consumes all the currently free memory. As soon as the 
PFMemallocDrop counter increases, we stop stress-ng. Now we see plenty 
of free memory again, but the counter is still increasing, and we see 
problems with new TCP sessions as soon as their packet size is above 
1500 bytes.
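
For reference, one way to watch the counter is:

    watch -n 1 nstat -az TcpExtPFMemallocDrop

(the same value also appears as PFMemallocDrop in /proc/net/netstat).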
[1] https://github.com/intel/ethernet-linux-ice

Best regards,
Christoph Petrausch

