Message-ID: <CADVnQykJZhYQZMtqg+Cm4BnuVeaBjPi+VWseNK+K7Y4eZRq_-w@mail.gmail.com>
Date: Tue, 20 Jan 2026 14:35:24 -0500
From: Neal Cardwell <ncardwell@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: chia-yu.chang@...ia-bell-labs.com, pabeni@...hat.com, edumazet@...gle.com,
parav@...dia.com, linux-doc@...r.kernel.org, corbet@....net, horms@...nel.org,
dsahern@...nel.org, kuniyu@...gle.com, bpf@...r.kernel.org,
netdev@...r.kernel.org, dave.taht@...il.com, jhs@...atatu.com,
stephen@...workplumber.org, xiyou.wangcong@...il.com, jiri@...nulli.us,
davem@...emloft.net, andrew+netdev@...n.ch, donald.hunter@...il.com,
ast@...erby.net, liuhangbin@...il.com, shuah@...nel.org,
linux-kselftest@...r.kernel.org, ij@...nel.org,
koen.de_schepper@...ia-bell-labs.com, g.white@...lelabs.com,
ingemar.s.johansson@...csson.com, mirja.kuehlewind@...csson.com,
cheshire@...le.com, rs.ietf@....at, Jason_Livingood@...cast.com,
vidhi_goel@...le.com
Subject: Re: [PATCH v9 net-next 15/15] selftests/net: packetdrill: add TCP
Accurate ECN cases
On Tue, Jan 20, 2026 at 1:53 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Mon, 19 Jan 2026 19:58:52 +0100 chia-yu.chang@...ia-bell-labs.com
> wrote:
> > Linux Accurate ECN test sets using ACE counters and AccECN options to
> > cover several scenarios: Connection teardown, different ACK conditions,
> > counter wrapping, SACK space grabbing, fallback schemes, negotiation
> > retransmission/reorder/loss, AccECN option drop/loss, different
> > handshake reflectors, data with marking, and different sysctl values.
>
> Thank you for closing the packetdrill side, and big thanks to Neal
> for prioritizing getting it reviewed and merged!
>
> I updated the packetdrill build in netdev CI and looks like one of
> the cases is flaking a little. Since it looks like you'll have to
> respin, please try to fix:
>
> # 1..2
> # tcp_accecn_client_accecn_options_lost.pkt:32: error handling packet: timing error: expected outbound packet in relative time range +0.020000~+0.500000 sec but happened at +0.015816 sec
> # script packet: 0.181936 .5 1:1013(1012) ack 1 <ECN e1b 1 ceb 0 e0b 1,nop>
> # actual packet: 0.177752 .EA 1:1013(1012) ack 1 win 1050 <ECN e1b 1 ceb 0 e0b 1,nop>
> # not ok 1 ipv4
> # tcp_accecn_client_accecn_options_lost.pkt:32: error handling packet: timing error: expected outbound packet in relative time range +0.020000~+0.500000 sec but happened at +0.015800 sec
> # script packet: 0.181952 .5 1:1013(1012) ack 1 <ECN e1b 1 ceb 0 e0b 1,nop>
> # actual packet: 0.177752 .EA 1:1013(1012) ack 1 win 1050 <ECN e1b 1 ceb 0 e0b 1,nop>
> # not ok 2 ipv6
> # # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
>
> https://netdev-ctrl.bots.linux.dev/logs/vmksft/packetdrill/results/482201/115-tcp-accecn-client-accecn-options-lost-pkt/stdout
Probably this is happening because the SRTT is around 56ms:
.050 * 7/8 + 1/8 * .1 = .05625 sec
So the RACK fast recovery starts after about 15ms, because .25 * srtt
is about 14ms:
(.050 * 7/8 + 1/8 * .1) * .25 = .0140625 sec
If we make the SRTT 100ms then the fast retransmit should be around:
(.1 * 7/8 + 1/8 * .1) * .25 = .025 sec
So I'd suggest changing the timing of the SYNACK from 50ms to 100ms:
old:
+0.05 < [ect0] SW. 0:0(0) ack 1 win 32767 <mss 1024,ECN e0b 1 ceb 0 e1b 1,nop,nop,nop,sackOK,nop,wscale 8>
new:
+.1 < [ect0] SW. 0:0(0) ack 1 win 32767 <mss 1024,ECN e0b 1 ceb 0 e1b 1,nop,nop,nop,sackOK,nop,wscale 8>
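For anyone who wants to re-check the numbers, the arithmetic above can be sketched in a few lines of Python. This is only a rough model of the estimate in this message, not the kernel's RACK code: the function names are made up for illustration, the second RTT sample of .1 sec is the one used in the calculation above, and the .020 sec lower bound is read off the "+0.020000~+0.500000" range in the CI log.

```python
# Rough model of the estimate above; names are illustrative, not kernel symbols.

def srtt_after_two_samples(first_rtt, second_rtt):
    """SRTT once the first sample has seeded it and a second is blended in with gain 1/8."""
    return first_rtt * 7 / 8 + second_rtt / 8


def approx_reo_wnd(srtt):
    """Approximate delay before the RACK fast retransmit, taken as .25 * srtt (assumption)."""
    return 0.25 * srtt


SECOND_RTT_SAMPLE = 0.1     # sec, the .1 term in the calculation above
SCRIPT_LOWER_BOUND = 0.020  # sec, lower edge of the script's allowed time range

for synack_delay in (0.05, 0.1):
    srtt = srtt_after_two_samples(synack_delay, SECOND_RTT_SAMPLE)
    wnd = approx_reo_wnd(srtt)
    verdict = "inside range" if wnd >= SCRIPT_LOWER_BOUND else "too early"
    print(f"SYNACK at +{synack_delay:.2f}s: srtt={srtt:.5f}s "
          f"retransmit after ~{wnd:.5f}s ({verdict})")
```

Under these assumptions the 50ms SYNACK lands before the .020 sec lower bound, and the 100ms SYNACK lands after it.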
neal