Message-Id: <E0922A4A-5715-4758-B067-ACB401BDB363@bejarano.io>
Date: Thu, 4 Sep 2025 10:56:29 +0200
From: Ricard Bejarano <ricard@...arano.io>
To: Ido Schimmel <idosch@...sch.org>
Cc: Andrew Lunn <andrew@...n.ch>,
 Mika Westerberg <mika.westerberg@...ux.intel.com>,
 netdev@...r.kernel.org,
 michael.jamet@...el.com,
 YehezkelShB@...il.com,
 andrew+netdev@...n.ch,
 davem@...emloft.net,
 edumazet@...gle.com,
 kuba@...nel.org,
 pabeni@...hat.com
Subject: Re: Poor thunderbolt-net interface performance when bridged

> I wrote that it can happen with forwarded traffic, not necessarily
> bridged traffic. Section 6 from here [1] shows that you get 900+ Mb/s
> between blue and purple with UDP, whereas with TCP you only get around
> 5Mb/s.

My assumption was that, because CRC failures cause L2 loss at every rx end
and TCP congestion control backs off in response, TCP bandwidth drops
exponentially with the number of hops.
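As a back-of-envelope sketch (my own symbols, nothing measured here): if each
hop independently drops a frame with probability p, delivery across n hops is
(1 - p)^n, i.e. it decays exponentially in the hop count, and the loss rate
TCP effectively sees is

  p_eff = 1 - (1 - p)^n ~= n * p        (for small p)

which, per the Mathis et al. approximation, caps steady-state throughput at
roughly

  BW <~ (MSS / RTT) * sqrt(3/2) / sqrt(p_eff)

so even sub-percent per-hop CRC loss compounds quickly once congestion
control backs off.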
So the problem is not so much the TCP vs. UDP bandwidth, but the L2 loss
caused by the CRC errors. That loss shows up at the rx end because that's
where CRC checksums are verified and bad frames get dropped; but barring
cable problems, I can only assume the bug is in the tx end driver.
I believe that's why Andrew Lunn pointed at the driver's handling of SKBs
with fragments as the possible culprit, but the fix breaks the test
completely.
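One crude way to test the fragment-handling theory, assuming I'm reading
drivers/net/thunderbolt/main.c correctly (this is my sketch, not Andrew's
patch, and tbnet_start_xmit() plus the stats accounting are my guesses at
the right hook), is to linearize every fragmented SKB on entry so the frag
walking code is never exercised:

/*
 * Diagnostic only: copy fragmented SKBs into a single linear buffer
 * before the normal transmit path runs. If the rx-side CRC errors
 * disappear with this in place, the driver's frag handling is the
 * likely culprit.
 */
static netdev_tx_t tbnet_start_xmit(struct sk_buff *skb,
				    struct net_device *dev)
{
	if (skb_shinfo(skb)->nr_frags && skb_linearize(skb)) {
		/* Out of memory while copying frags; drop the frame. */
		dev->stats.tx_dropped++;
		dev_kfree_skb_any(skb);
		return NETDEV_TX_OK;
	}

	/* ... existing transmit path, unchanged ... */
}

If the errors persist even with fully linear SKBs, the frag path is probably
innocent and the corruption happens elsewhere (DMA mapping, ring handling,
or the cable after all).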

> Assuming you are talking about [2], it shows 16763 errors out of 6360635
> received packets. That's 0.2%.

Those were aggregated counters; they include multiple tests, some of them
below the ~250-300 Mb/s threshold where loss begins to appear.
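For per-test numbers, something like the sketch below (userspace, standard
sysfs statistics path; "thunderbolt0", the peer address and the iperf3
invocation are placeholders) snapshots rx_crc_errors around a single run, so
the rate isn't diluted by earlier low-bandwidth tests:

/*
 * Sketch: read rx_crc_errors before and after one test and print the
 * delta, so loss can be computed per run instead of from aggregated
 * counters.
 */
#include <stdio.h>
#include <stdlib.h>

static unsigned long long read_crc_errors(const char *ifname)
{
	char path[256];
	unsigned long long v = 0;
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/class/net/%s/statistics/rx_crc_errors", ifname);
	f = fopen(path, "r");
	if (!f) {
		perror(path);
		exit(1);
	}
	if (fscanf(f, "%llu", &v) != 1)
		v = 0;
	fclose(f);
	return v;
}

int main(int argc, char **argv)
{
	const char *ifname = argc > 1 ? argv[1] : "thunderbolt0";
	unsigned long long before = read_crc_errors(ifname);

	/* Run exactly one test between the two snapshots (placeholder peer). */
	if (system("iperf3 -c 192.168.0.2 -t 30") != 0)
		fprintf(stderr, "test command failed\n");

	printf("rx_crc_errors delta: %llu\n",
	       read_crc_errors(ifname) - before);
	return 0;
}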

> I suggest removing the custom patches and re-testing with TSO disabled
> (on both red and blue). If this doesn't help, you can try recording
> packet drops on blue like I suggested in the previous mail.

Disabling TSO on both ends didn't change the iperf results. I currently have
no bandwidth to do the perf tests.

Thanks,
RB
