Message-Id: <5DE64000-782A-492C-A653-7EB758D28283@bejarano.io>
Date: Mon, 26 May 2025 10:50:43 +0200
From: Ricard Bejarano <ricard@...arano.io>
To: Mika Westerberg <mika.westerberg@...ux.intel.com>
Cc: netdev@...r.kernel.org,
 michael.jamet@...el.com,
 YehezkelShB@...il.com,
 andrew+netdev@...n.ch,
 davem@...emloft.net,
 edumazet@...gle.com,
 kuba@...nel.org,
 pabeni@...hat.com
Subject: Re: Poor thunderbolt-net interface performance when bridged

Hey, thanks again for looking into this.

Yes, these are 8th generation Intel NUCs with Thunderbolt 3, not 4. And yes, the
cable I have used so far is Thunderbolt "compatible", not "certified", and it
doesn't have the lightning logo[1].

I am not convinced, though.


Part I: Thunderbolt 3
---------------------

I first ran into this issue a few months ago with a set of three 12th/13th
generation Intel NUCs, each with 2 Thunderbolt 4 ports, directly connected to
each other to form a ring network. When hopping through one of them, bandwidth
dropped from ~16Gbps to ~5Mbps, both when routing and when bridging. These 3
NUCs are in "production", so I didn't want to use them as my test bench. They
are rocking "Thunderbolt 4 certified" cables with the lightning logo[2].

I could justify running any one of the following disruptive tests if you think
they would be helpful:

Note: A is connected to B, B to C, and C to A (to form a ring).

1) Configure A and C to route to each other via B if the A<->C link is down,
   then disconnect A<->C and run iperfs in all directions, like in [4.6].
   If they run at ~16Gbps when hopping via B, then TB3 was (at least part of)
   the problem; otherwise it must be something wrong with the driver.
   I am very confident speed will drop when hopping via B, because this is how I
   first came across this issue. I wanted nodes of the ring to use the other way
   around if the direct path wasn't up, but that wasn't possible due to the huge
   bandwidth drop.

2) Same as #1 but configure B to bridge both of its Thunderbolt interfaces.

3) While pulling the A<->C cable for running one of the above, test that cable
   in the 8th gen test bench. This cable is known to run at ~16Gbps when
   connecting A and C via their Thunderbolt 4 ports.
   While very unlikely, if this somehow fixes the red->purple bandwidth drop,
   then we know the current cable was to blame.
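
To make "iperfs in all directions" concrete, here is a rough sketch of the test
matrix for the ring. It assumes iperf3 (with its JSON output) rather than
whichever iperf build ends up being used, and the hostnames A, B and C are
placeholders; the helper names are mine.

```python
import json
from itertools import permutations

# Placeholder hostnames for the three NUCs in the ring (assumption).
NODES = ["A", "B", "C"]


def directed_pairs(nodes):
    """Every (client, server) combination, so each direction is tested."""
    return list(permutations(nodes, 2))


def iperf3_client_command(server, seconds=10):
    """Command to run on the client; assumes 'iperf3 -s' runs on the server."""
    return ["iperf3", "-c", server, "-t", str(seconds), "-J"]


def received_gbps(report_json):
    """Receiver-side throughput in Gbps from an iperf3 JSON report (TCP)."""
    report = json.loads(report_json)
    return report["end"]["sum_received"]["bits_per_second"] / 1e9


if __name__ == "__main__":
    # Print the commands instead of running them; nothing touches the network.
    for client, server in directed_pairs(NODES):
        print(f"on {client}:", " ".join(iperf3_client_command(server)))
```

With the A<->C cable pulled, the A->C and C->A rows are the ones that hop via B.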

These 12/13th gen NUCs are running non-upstream kernels, however, and while I
can justify playing around a bit with their connections, I can't justify pulling
them out of production to install upstream kernels and make them our test bench.

Do you think any of these tests would be helpful?


Part II: the cable
------------------

You also point to the cable as the likely culprit.

1) But then, why does iperf between red<->blue[4.6.1] show ~9Gbps both ways, but
   red->blue->purple[4.6.3a] drops to ~5Mbps? If the cable were to blame,
   wouldn't red->blue[4.6.1a] also drop to about the same?

2) Also, if the problem were the cable's bandwidth in the red->blue direction,
   flipping the cable around should show a similar bandwidth drop in the (now)
   blue->red direction, right?
   I have tested this and it doesn't hold true: iperfs in all directions after
   flipping the cable around gave about the same results as in [4.6], further
   pointing at something other than the cable itself.
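
Point 1 spelled out as arithmetic, as a rough sketch: a hop can only be the
bottleneck of a path if its own throughput is about as low as the path's. The
function name and the tolerance are made up for illustration; the figures are
the approximate ones from [4.6.1] and [4.6.3a].

```python
def hop_explains_path(hop_bps, path_bps, tolerance=0.5):
    """True if a single hop is slow enough to account for the path throughput.

    The tolerance (hypothetical, 50% by default) allows for measurement noise.
    """
    return hop_bps <= path_bps * (1 + tolerance)


RED_BLUE_BPS = 9e9          # ~9Gbps, red<->blue over the cable in question
RED_BLUE_PURPLE_BPS = 5e6   # ~5Mbps, red->blue->purple

if __name__ == "__main__":
    # The red->blue hop alone is roughly 1800x faster than the forwarded path,
    # so the cable on that hop cannot be what limits it.
    print(hop_explains_path(RED_BLUE_BPS, RED_BLUE_PURPLE_BPS))  # False
```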

I've attached the output of 'tblist -Av'. It shows negotiated speed at 10Gb/s in
both Rx/Tx, which lines up with the red<->blue iperf bandwidth tests of [4.6.1].


How shall we proceed?

I reckon all my statements about the 12/13th gen NUCs are anecdata and not as
scientific as my 8th gen NUC results, but I'm happy to perform any one of the
three tests above.


Thanks again,
Ricard Bejarano

--
[1] https://www.amazon.es/-/en/dp/B0C93G2M83
[2] https://www.amazon.es/-/en/dp/B095KSL2B9

