Message-ID: <20250526092220.GO88033@black.fi.intel.com>
Date: Mon, 26 May 2025 12:22:20 +0300
From: Mika Westerberg <mika.westerberg@...ux.intel.com>
To: Ricard Bejarano <ricard@...arano.io>
Cc: netdev@...r.kernel.org, michael.jamet@...el.com, YehezkelShB@...il.com,
	andrew+netdev@...n.ch, davem@...emloft.net, edumazet@...gle.com,
	kuba@...nel.org, pabeni@...hat.com
Subject: Re: Poor thunderbolt-net interface performance when bridged

On Mon, May 26, 2025 at 10:50:43AM +0200, Ricard Bejarano wrote:
> Hey, thanks again for looking into this.

No problem.

> Yes, these are 8th generation Intel NUCs with Thunderbolt 3, not 4. And yes, the
> cable I have used so far is Thunderbolt "compatible" not "certified", and it
> doesn't have the lightning logo[1].
> 
> I am not convinced, though.
> 
> Part I: Thunderbolt 3
> ---------------------
> 
> I first ran into this issue a few months ago with a set of 3 12/13th generation
> Intel NUCs, each of which has 2 Thunderbolt 4 ports, directly connected to each
> other so as to form a ring network. When hopping through one of them, bandwidth
> dropped from ~16Gbps to ~5Mbps. Both in routing and bridging. These 3 NUCs are
> in "production" so I didn't want to use them as my test bench. They are rocking
> "Thunderbolt 4 certified" cables with the lightning logo[2].
> 
> I could justify running any one of the following disruptive tests if you think
> they would be helpful:
> 
> Note: A is connected to B, B to C, and C to A (to form a ring).

I suggest keeping the "test case" as simple as possible.

Simple peer-to-peer, no routing, nothing else. Anything more makes things
hard to debug. Also note that this whole thing is meant to be used
peer-to-peer, not as some full-fledged networking solution.

> 1) Configure A and C to route to each other via B if the A<->C link is down,
>    then disconnect A<->C and run iperfs in all directions, like in [4.6].
>    If they run at ~16Gbps when hopping via B, then TB3 was (at least part of)
>    the problem; otherwise it must be something wrong with the driver.
>    I am very confident speed will drop when hopping via B, because this is how
>    I first came across this issue. I wanted nodes of the ring to route the
>    other way around the ring if the direct path wasn't up, but that wasn't
>    possible due to the huge bandwidth drop.
> 
> 2) Same as #1 but configure B to bridge both of its Thunderbolt interfaces.
> 
> 3) While pulling the A<->C cable for running one of the above, test that cable
>    in the 8th gen test bench. This cable is known to run at ~16Gbps when
>    connecting A and C via their Thunderbolt 4 ports.
>    While very unlikely, if this somehow fixes the red->purple bandwidth drop,
>    then we know the current cable was to blame.
> 
> These 12/13th gen NUCs are running non-upstream kernels, however, and while I
> can justify playing around a bit with their connections, I can't justify pulling
> them out of production to install upstream kernels and make them our test bench.
> 
> Do you think any one of these tests would be helpful?

Let's forget bridges for now and anything other than this:

  Host A <- Thunderbolt Cable -> Host B
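
As a rough sketch (the interface name and addresses here are only examples;
thunderbolt-net normally shows up as thunderbolt0):

  # Host A
  ip addr add 192.168.100.1/24 dev thunderbolt0
  ip link set thunderbolt0 up
  iperf3 -s

  # Host B
  ip addr add 192.168.100.2/24 dev thunderbolt0
  ip link set thunderbolt0 up
  iperf3 -c 192.168.100.1      # and with -R for the reverse direction

If the throughput is already bad in that setup there are much fewer moving
parts to look at.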

> Part II: the cable
> ------------------
> 
> You also point to the cable as the likely culprit.
> 
> 1) But then, why does iperf between red<->blue[4.6.1] show ~9Gbps both ways, but
>    red->blue->purple[4.6.3a] drops to ~5Mbps? If the cable were to blame,
>    wouldn't red->blue[4.6.1a] also drop to about the same?

I'm saying there are two things that will certainly limit the maximum
throughput you get:

 1. You use non-certified cables, so you are limited to 10 Gb/s per lane
    instead of 20 Gb/s per lane.

 2. Your system uses the firmware connection manager, which does not support
    lane bonding, so instead of 2 x 10 Gb/s = 20 Gb/s you only get 1 x 10 Gb/s.

It is enough for one of the hosts to have these limitations; it will affect
the whole link. So instead of 40 Gb/s with lane bonding you get 10 Gb/s
(although there are some limitations on the DMA side so you don't get the
full 40 Gb/s, but certainly more than what the 10 Gb/s single lane gives
you).

> 2) Also, if the problem were the cable's bandwidth in the red->blue direction,
>    flipping the cable around should show a similar bandwidth drop in the (now)
>    blue->red direction, right?
>    I have tested this and it doesn't hold true: iperfs in all directions after
>    flipping the cable around gave about the same results as in [4.6], further
>    pointing at something other than the cable itself.

You can check the link speed using the tool I referred to. It may be that
sometimes it manages to negotiate the 20 Gb/s link but sometimes not.
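
If you don't have the tool at hand, the same information should be visible in
sysfs, assuming your kernel exposes the speed/lane attributes for the remote
host entry (the device path below is only an example; check what shows up
under /sys/bus/thunderbolt/devices on your system):

  # number of lanes in use; 2 means lane bonding is active
  cat /sys/bus/thunderbolt/devices/0-1/rx_lanes
  cat /sys/bus/thunderbolt/devices/0-1/tx_lanes
  # negotiated speed per lane, e.g. 10.0 Gb/s or 20.0 Gb/s
  cat /sys/bus/thunderbolt/devices/0-1/rx_speed
  cat /sys/bus/thunderbolt/devices/0-1/tx_speed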

> I've attached the output of 'tblist -Av'. It shows negotiated speed at 10Gb/s in
> both Rx/Tx, which lines up with the red<->blue iperf bandwidth tests of [4.6.1].

You missed the attachment? But anyway, as I suspected, it shows the same.

> How shall we proceed?

Well, if the link is degraded to 10 Gb/s then I'm not sure there is
anything more I can do here.

If that is not the case, e.g. you see that the link is 40 Gb/s but you still
see crappy throughput, then we need to investigate (but keep the topology as
simple as possible). In that case please provide full dmesg (with
thunderbolt.dyndbg=+p) from both sides of the link and I can take a look.
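
If adding the boot parameter is inconvenient, enabling the debug messages at
runtime should work too, as long as you re-plug the cable afterwards so the
connection sequence gets logged (this assumes debugfs is mounted in the usual
place):

  echo 'module thunderbolt +p' > /sys/kernel/debug/dynamic_debug/control
  # re-plug the cable, then save the log, e.g.:
  dmesg > tb-dmesg-host-a.txt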
