Message-ID: <CANn89iL8YKZZQZSmg5WqrYVtyd2PanNXzTZ2Z0cObpv9_XSmoQ@mail.gmail.com>
Date: Tue, 14 Oct 2025 01:54:24 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, 
	Simon Horman <horms@...nel.org>, Neal Cardwell <ncardwell@...gle.com>, 
	Willem de Bruijn <willemb@...gle.com>, Kuniyuki Iwashima <kuniyu@...gle.com>, netdev@...r.kernel.org, 
	eric.dumazet@...il.com
Subject: Re: [PATCH net-next] tcp: better handle TCP_TX_DELAY on established flows

On Tue, Oct 14, 2025 at 1:29 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Tue, Oct 14, 2025 at 1:22 AM Paolo Abeni <pabeni@...hat.com> wrote:
> >
> > On 10/13/25 4:59 PM, Eric Dumazet wrote:
> > > Some applications use the TCP_TX_DELAY socket option after the TCP flow
> > > is established.
> > >
> > > Some metrics need to be updated, otherwise TCP might take time to
> > > adapt to the new (emulated) RTT.
> > >
> > > This patch adjusts tp->srtt_us, tp->rtt_min, icsk_rto
> > > and sk->sk_pacing_rate.
> > >
> > > This is best effort, and for instance icsk_rto is reset
> > > without taking backoff into account.
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> >
> > The CI is consistently reporting pktdrill failures on top of this patch:
> >
> > # selftests: net/packetdrill: tcp_user_timeout_user-timeout-probe.pkt
> > # TAP version 13
> > # 1..2
> > # tcp_user_timeout_user-timeout-probe.pkt:35: error in Python code
> > # Traceback (most recent call last):
> > #   File "/tmp/code_T7S7S4", line 202, in <module>
> > #     assert tcpi_probes == 6, tcpi_probes; \
> > # AssertionError: 0
> > # tcp_user_timeout_user-timeout-probe.pkt: error executing code:
> > 'python3' returned non-zero status 1
> >
> > To be accurate, the patches batch under tests also includes:
> >
> > https://patchwork.kernel.org/project/netdevbpf/list/?series=1010780
> >
> > but the latter looks even more unlikely to cause the reported issues?!?

Not sure. Look at how the packetdrill test sets up its qdisc:
`tc qdisc delete dev tun0 root 2>/dev/null ; tc qdisc add dev tun0 root pfifo limit 0`

After "net: dev_queue_xmit() llist adoption", __dev_xmit_skb() might
return NET_XMIT_SUCCESS instead of NET_XMIT_DROP.

__tcp_transmit_skb() has some code to detect NET_XMIT_DROP
immediately, instead of relying on a timer.
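
For illustration, a rough userspace sketch of that detection. The
NET_XMIT_* values are the ones from include/linux/netdevice.h and
xmit_eval() has the same shape as the kernel's net_xmit_eval(), but
struct xmit_result and model_transmit() are hypothetical names made up
for this sketch, not kernel code:

```c
#include <stdbool.h>

/* Transmit return codes, same values as include/linux/netdevice.h. */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01 /* skb was dropped */
#define NET_XMIT_CN      0x02 /* congestion notification */

/* Same shape as the kernel's net_xmit_eval(): CN is not an error. */
static int xmit_eval(int rc)
{
	return rc == NET_XMIT_CN ? 0 : rc;
}

/* Hypothetical result type: what the sender learns from one transmit. */
struct xmit_result {
	bool enter_cwr; /* sender reacts to local congestion right away */
	int err;        /* nonzero: caller knows the packet was not sent */
};

/*
 * Hypothetical model of the check done after handing the skb to the
 * lower layers: any positive return code is treated as local
 * congestion, and only a real drop is propagated as an error.
 */
static struct xmit_result model_transmit(int rc)
{
	struct xmit_result r = { .enter_cwr = false, .err = rc };

	if (rc > 0) {
		r.enter_cwr = true;
		r.err = xmit_eval(rc);
	}
	return r;
}
```

With this model, model_transmit(NET_XMIT_DROP) reports an error
immediately, while a drop that is reported as NET_XMIT_SUCCESS looks
like a successful send and can only be noticed later by a timer.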

I can fix the 'single packet' case, but not the case of many packets
being sent in parallel.

Note that this issue already existed for qdiscs with TCQ_F_CAN_BYPASS:
we were returning NET_XMIT_SUCCESS even if the driver had to drop the packet.

The test is flaky even without the
https://patchwork.kernel.org/project/netdevbpf/list/?series=1010780
series.
