Message-ID: <2aca9aff-1d96-43fd-8125-290e7600915e@hetzner-cloud.de>
Date: Thu, 11 Dec 2025 19:00:44 +0100
From: Marcus Wichelmann <marcus.wichelmann@...zner-cloud.de>
To: Jacob Keller <jacob.e.keller@...el.com>,
 Tony Nguyen <anthony.l.nguyen@...el.com>,
 Przemek Kitszel <przemyslaw.kitszel@...el.com>,
 Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller"
 <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 intel-wired-lan@...ts.osuosl.org, Netdev <netdev@...r.kernel.org>,
 linux-kernel@...r.kernel.org
Cc: sdn@...zner-cloud.de
Subject: Re: [Intel-wired-lan] [BUG] ice: Temporary packet processing overload
 causes permanent RX drops

Am 09.12.25 um 01:05 schrieb Jacob Keller:
> On 12/5/2025 6:01 AM, Marcus Wichelmann wrote:
>> Hi there, I broke some network cards again. This time I noticed continuous RX packet drops with an Intel E810-XXV.
>>
>> We have reproduced this with:
>>   Linux 6.8.0-88-generic (Ubuntu 24.04)
>>   Linux 6.14.0-36-generic (Ubuntu 24.04 HWE)
>>   Linux 6.18.0-061800-generic (Ubuntu Mainline PPA)
> 
> I think we recently merged a bunch of work on the Rx path as part of our
> conversion to page pool. It would be interesting to see if those changes
> impact this. Clearly the issue goes back some time since v6.8 at least..
Hi Jacob,

I guess you mean 93f53db9f9dc ("ice: switch to Page Pool")?

I have now repeated all tests with a kernel built from latest net-next
branch and can still reproduce it, even though I needed way higher packet
rates (15 instead of 4 Mpps when using 256 channels). Something about the
packet processing on our test system seems to have gotten way more
efficient with this kernel update.

The symptoms are the same: the following IO_PAGE_FAULTs appear in the
kernel log, and after that there is a permanent packet loss of 1-10%,
even at very low packet rates.

  kernel: ice 0000:c7:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x002b address=0x4000180000 flags=0x0020]
  kernel: ice 0000:c7:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x002b address=0x4000180000 flags=0x0020]
  kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
  kernel: ice 0000:c7:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x002b address=0x4000180000 flags=0x0020]
  kernel: workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
  kernel: ice 0000:c7:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x002c address=0x4000180000 flags=0x0020]
  [...]
  kernel: ice 0000:c7:00.1: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x002c address=0x4000180000 flags=0x0020]
  kernel: amd_iommu_report_page_fault: 10 callbacks suppressed
  [...]
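
For anyone reproducing this, the faults can be watched live with
something like this (just a generic kernel-log filter, nothing specific
to my setup):

  # follow the kernel log and show only the IOMMU page fault events
  dmesg -wT | grep IO_PAGE_FAULT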

I experimented with a few different channel counts and noticed that
the issue only occurs with a combined channel count >128. So on
systems with fewer CPU cores, this bug probably never occurs.

  256: reproduced.
  254: reproduced.
  200: reproduced.
  129: reproduced.
  128: stable.
   64: stable.

Tested using "ethtool -L eth{0,1} combined XXX".
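
Roughly, one iteration of that sweep looks like this (the interface
names are placeholders for our two E810 ports, and the traffic-generator
step is only sketched):

  # set the combined channel count on both ports
  ethtool -L eth0 combined 129
  ethtool -L eth1 combined 129
  # overload the RX path with the traffic generator for a while,
  # then stop it and check whether the port got stuck in the broken state:
  ip -s link show eth0    # RX "dropped" keeps growing, even at low rates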

With <=128 channels, only the "... hogged CPU ..." warnings appear
but no IO_PAGE_FAULTs. There is also no permanent packet loss after
stopping the traffic generator.

>> [...]
>>
>> 3. Stop the traffic generator and re-run it with a way lower packet rate, e.g. 10,000 pps. Now it can be seen that
>> a good part of these packets is being dropped, even though the kernel could easily keep up with this small packet rate.
> 
> I assume the rx_dropped counter still incrementing here?

Yes. After the NIC is in this broken state, a few percent of all
packets are being dropped and the rx_dropped counter increases
with each of them.
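
A simple way to see it is to watch the counter while sending at a low
rate, e.g. (eth0 being the affected port):

  # on a healthy port this stays (nearly) constant at 10 kpps,
  # on a broken one it keeps incrementing
  watch -n 1 cat /sys/class/net/eth0/statistics/rx_dropped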

>> [...]

I also looked into why the packet processing load on this system
is so high; `perf top` shows that it almost entirely originates
from native_queued_spin_lock_slowpath.

When digging deeper using `perf lock contention -Y spinlock`:

 contended   total wait     max wait     avg wait         type   caller
   1724043      4.36 m     198.66 us    151.66 us     spinlock   __netif_receive_skb_core.constprop.0+0x832
     35960      2.51 s     112.57 ms     69.51 us     spinlock   __netif_receive_skb_core.constprop.0+0x832
       620    103.79 ms    189.87 us    167.40 us     spinlock   do_sys_poll+0x26f
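
In case someone wants to repeat that measurement, something along these
lines should work (the BPF backend and the 10 s window are just one way
to run it system-wide):

  # sample lock contention system-wide via the BPF backend for 10 seconds
  # and report spinlocks only, as in the table above
  perf lock contention -a -b -Y spinlock sleep 10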

I'm not yet sure what is causing this.
I don't think it's directly related to this issue, but maybe it's part
of what brings this bug to light, so it's probably still worth mentioning.

I hope you can make some sense of all that.

Thanks,
Marcus
