Message-Id: <48EE4097-8685-47D1-8C20-EE18A147A020@bejarano.io>
Date: Thu, 29 May 2025 12:06:45 +0200
From: Ricard Bejarano <ricard@...arano.io>
To: Andrew Lunn <andrew@...n.ch>
Cc: Mika Westerberg <mika.westerberg@...ux.intel.com>,
 netdev@...r.kernel.org,
 michael.jamet@...el.com,
 YehezkelShB@...il.com,
 andrew+netdev@...n.ch,
 davem@...emloft.net,
 edumazet@...gle.com,
 kuba@...nel.org,
 pabeni@...hat.com
Subject: Re: Poor thunderbolt-net interface performance when bridged

So here's what I've observed about tbnet_xmit_csum_and_map after sprinkling
counters all over it and running various tests (a rough sketch of that
instrumentation follows the list):

1. skb->ip_summed is always CHECKSUM_PARTIAL, so we never execute L1004-L1014.

2. protocol is always == htons(ETH_P_IP), so we:
     2.1. Never execute L1021-L1027.
     2.2. Always execute L1036-L1051. Specifically, we execute L1045 N+1 times
          per iperf3 test (where N is the total number of packets sent as
          reported by iperf3), meaning those are ip_hdr(skb)->protocol ==
          IPPROTO_UDP; and L1043 a total of 14 times (2 times 7, interesting),
          meaning those are ip_hdr(skb)->protocol == IPPROTO_TCP. From packet
          captures of other iperf3 UDP tests, I'm confident these 14 TCP
          packets are iperf3 control-plane traffic: the cookie, a couple of
          JSONs with test metadata, etc.
     2.3. Never execute L1052-L1064.

3. Once again, both lossless and lossy tests share the same execution pattern.
   There's not a single logic branch of tbnet_xmit_csum_and_map that is only
   executed when there's loss.
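
For reference, the instrumentation was just per-branch counters of roughly this
shape (the counter names are mine, the hit counts in the comments are the ones
reported above, and the checks mirror the ones already in
tbnet_xmit_csum_and_map, with "protocol" being the value the function already
derives from the skb):

        static atomic64_t cnt_not_partial;      /* ip_summed != CHECKSUM_PARTIAL */
        static atomic64_t cnt_ip_tcp;           /* ETH_P_IP && IPPROTO_TCP */
        static atomic64_t cnt_ip_udp;           /* ETH_P_IP && IPPROTO_UDP */

        /* ...inside tbnet_xmit_csum_and_map(), next to the existing branches: */

        if (skb->ip_summed != CHECKSUM_PARTIAL)
                atomic64_inc(&cnt_not_partial);         /* stayed at 0 in every test */

        if (protocol == htons(ETH_P_IP)) {
                if (ip_hdr(skb)->protocol == IPPROTO_TCP)
                        atomic64_inc(&cnt_ip_tcp);      /* the 14 hits per test */
                else if (ip_hdr(skb)->protocol == IPPROTO_UDP)
                        atomic64_inc(&cnt_ip_udp);      /* the N+1 hits per test */
        }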

It's interesting, however, that the number of TCP packets is exactly twice the
number of non-linear sk_buffs we saw in tbnet_start_xmit. Not that it's
necessarily suspicious; if anything (and because we see those TCP packets in
blue) it tells us that the handling of non-linear skbs is not the problem in
tbnet_start_xmit. But why twice? Or is this a red herring?
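
(For what it's worth, the non-linear skb count on the Tx side can be taken the
same way, e.g. something along these lines in tbnet_start_xmit; the counter
name is mine, skb_is_nonlinear() is the stock helper:)

        if (skb_is_nonlinear(skb))              /* data sits in frags, not only the linear head */
                atomic64_inc(&cnt_nonlinear);   /* the 7 per test implied above */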


I'm running out of ideas: tbnet_xmit_csum_and_map looks good to me, and so do
tbnet_kmap_frag and tbnet_start_xmit. I'm going to explore the Rx side a bit
instead.

Any suggestions?

Thanks,
RB

