Message-ID: <CADVnQyktk+XpvLuc6jZa5CpqoGyjzzzYJ5iJk3=Eh5JAGyNyVQ@mail.gmail.com>
Date: Tue, 10 Jun 2025 13:15:24 -0400
From: Neal Cardwell <ncardwell@...gle.com>
To: Eric Wheeler <netdev@...ts.ewheeler.net>
Cc: netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>, 
	Geumhwan Yu <geumhwan.yu@...sung.com>, Jakub Kicinski <kuba@...nel.org>, 
	Sasha Levin <sashal@...nel.org>, Yuchung Cheng <ycheng@...gle.com>, stable@...nel.org
Subject: Re: [BISECT] regression: tcp: fix to allow timestamp undo if no
 retransmits were sent

On Mon, Jun 9, 2025 at 1:45 PM Neal Cardwell <ncardwell@...gle.com> wrote:
>
> On Sat, Jun 7, 2025 at 7:26 PM Neal Cardwell <ncardwell@...gle.com> wrote:
> >
> > On Sat, Jun 7, 2025 at 6:54 PM Neal Cardwell <ncardwell@...gle.com> wrote:
> > >
> > > On Sat, Jun 7, 2025 at 3:13 PM Neal Cardwell <ncardwell@...gle.com> wrote:
> > > >
> > > > On Fri, Jun 6, 2025 at 6:34 PM Eric Wheeler <netdev@...ts.ewheeler.net> wrote:
> > > > >
> > > > > On Fri, 6 Jun 2025, Neal Cardwell wrote:
> > > > > > On Thu, Jun 5, 2025 at 9:33 PM Eric Wheeler <netdev@...ts.ewheeler.net> wrote:
> > > > > > >
> > > > > > > Hello Neal,
> > > > > > >
> > > > > > > After upgrading to Linux v6.6.85 on an older Supermicro SYS-2026T-6RFT+
> > > > > > > with an Intel 82599ES 10GbE NIC (ixgbe) linked to a Netgear GS728TXS at
> > > > > > > 10GbE via one SFP+ DAC (no bonding), we found TCP performance with
> > > > > > > existing devices on 1Gbit ports was <60Mbit; however, TCP with devices
> > > > > > > across the switch on 10Gbit ports runs at full 10GbE.
> > > > > > >
> > > > > > > Interestingly, the problem only presents itself when transmitting
> > > > > > > from Linux; receive traffic (to Linux) performs just fine:
> > > > > > >         ~60Mbit: Linux v6.6.85 =TX=> 10GbE -> switch -> 1GbE  -> device
> > > > > > >          ~1Gbit: device        =TX=>  1GbE -> switch -> 10GbE -> Linux v6.6.85
> > > > > > >
> > > > > > > Through bisection, we found this first-bad commit:
> > > > > > >
> > > > > > >         tcp: fix to allow timestamp undo if no retransmits were sent
> > > > > > >                 upstream:       e37ab7373696e650d3b6262a5b882aadad69bb9e
> > > > > > >                 stable 6.6.y:   e676ca60ad2a6fdeb718b5e7a337a8fb1591d45f

Hi Eric,

Do you have cycles to test a proposed fix patch developed by our team?

The attached patch should apply (with "git am") for any recent kernel
that has the "tcp: fix to allow timestamp undo if no retransmits were
sent" patch it is fixing. So you should be able to test it on top of
the 6.6 stable or 6.15 stable kernels you used earlier. Whichever is
easier.

If you have cycles to rerun your iperf test, with tcpdump, nstat, and
ss instrumentation, that would be fantastic!

The patch passes our internal packetdrill test suite, including new
tests for this issue (based on the packetdrill scripts posted earlier
in this thread).

But it would be fantastic to directly confirm that this fixes your issue.

Thanks!
neal

Download attachment "0001-tcp-fix-tcp_packet_delayed-for-tcp_is_non_sack_preve.patch" of type "application/octet-stream" (4511 bytes)
