Message-Id: <38B49EF9-4A56-4004-91CF-5A2D591E202D@bejarano.io>
Date: Tue, 27 May 2025 20:57:49 +0200
From: Ricard Bejarano <ricard@...arano.io>
To: Andrew Lunn <andrew@...n.ch>
Cc: Mika Westerberg <mika.westerberg@...ux.intel.com>,
 netdev@...r.kernel.org,
 michael.jamet@...el.com,
 YehezkelShB@...il.com,
 andrew+netdev@...n.ch,
 davem@...emloft.net,
 edumazet@...gle.com,
 kuba@...nel.org,
 pabeni@...hat.com
Subject: Re: Poor thunderbolt-net interface performance when bridged

> Maybe hack out this test, and allow the corrupt frame to be
> received. Then look at it with Wireshark and see if you can figure out
> what is wrong with it. Knowing what is wrong with it might allow you
> to backtrack to where it gets mangled.

I've done this:

diff --git a/drivers/net/thunderbolt/main.c b/drivers/net/thunderbolt/main.c
index 0a53ec2..8db0301 100644
--- a/drivers/net/thunderbolt/main.c
+++ b/drivers/net/thunderbolt/main.c
@@ -736,7 +736,7 @@ static bool tbnet_check_frame(struct tbnet *net, const struct tbnet_frame *tf,
 
        if (tf->frame.flags & RING_DESC_CRC_ERROR) {
                net->stats.rx_crc_errors++;
-               return false;
+               return true;
        } else if (tf->frame.flags & RING_DESC_BUFFER_OVERRUN) {
                net->stats.rx_over_errors++;
                return false;

Then I set up iperf3 and tcpdump, but the kernel panics:

May 27 18:30:32 blue kernel: skbuff: skb_over_panic: text:ffffffffc0c1b9e7 len:1545195755 put:1545195755 head:ffff9c9dcc652000 data:ffff9c9dcc65200c tail:0x5c19d0f7 end:0x1ec0 dev:<NULL>

This is the last and only line I see in journalctl.
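
If I'm reading mainline tbnet_check_frame() right, returning true in the CRC
branch skips the frame_size sanity checks further down in that function, so
tbnet_poll() ends up calling skb_put() with a garbage hdr->frame_size (the
put:1545195755 above). Untested guess at a safer version of the hack, which
keeps the corrupt frame but still lets the existing length checks drop
impossible sizes:

        if (tf->frame.flags & RING_DESC_CRC_ERROR) {
                net->stats.rx_crc_errors++;
                /* Keep the frame: fall through to the frame_size checks
                 * below instead of returning true here, so a corrupted
                 * hdr->frame_size can never reach skb_put().
                 */
        } else if (tf->frame.flags & RING_DESC_BUFFER_OVERRUN) {
                net->stats.rx_over_errors++;
                return false;
        }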

This only happens in tests with loss: runs at 1/10/100 Mbps don't panic, but as
soon as I get near the ~250-300 Mbps inflection point I mentioned earlier, the
machine hangs forever (panics). tcpdump doesn't write anything to disk when that
happens, so how could we capture this?

Is there a way I can tcpdump or similar before the driver reads the packet?

Perhaps modify the driver so it writes the skbs somewhere itself?
Would the performance hit affect the measurement?
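
Something like this maybe, assuming tbnet_poll() in my tree still has tf and
the mapped header (hdr) in scope before it calls tbnet_check_frame(), as in
mainline. Untested sketch, using print_hex_dump() and gated on the CRC flag so
it only fires for the corrupt frames:

        /* Dump the raw received frame (header + start of payload) to dmesg
         * before any validation, so we can look at the bytes even for frames
         * the driver would normally drop.  Cap the length instead of trusting
         * hdr->frame_size, which may be garbage on a CRC error.
         */
        if (tf->frame.flags & RING_DESC_CRC_ERROR)
                print_hex_dump(KERN_DEBUG, "tbnet rx: ", DUMP_PREFIX_OFFSET,
                               16, 1, hdr,
                               min_t(size_t, 256, TBNET_RX_PAGE_SIZE), true);

Dumping every frame at ~300 Mbps would obviously skew the numbers, so gating it
on RING_DESC_CRC_ERROR should keep the hot path mostly untouched.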

Thanks again,
RB
