netdev - Re: [PATCH v3 2/2] net/tcp: add packetdrill test for FIN-WAIT-1 zero-window fix

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAAVpQUBAKZvSpcrOiGXPXwzEZL5soXBMGZca6qcsAahyWLXRKQ@mail.gmail.com>
Date: Mon, 27 Oct 2025 11:15:25 -0700
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: Neal Cardwell <ncardwell@...gle.com>
Cc: HaiYang Zhong <wokezhong@...il.com>, davem@...emloft.net, dsahern@...nel.org, 
	edumazet@...gle.com, horms@...nel.org, kuba@...nel.org, 
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org, pabeni@...hat.com, 
	wokezhong@...cent.com
Subject: Re: [PATCH v3 2/2] net/tcp: add packetdrill test for FIN-WAIT-1
 zero-window fix

On Mon, Oct 27, 2025 at 8:08 AM Neal Cardwell <ncardwell@...gle.com> wrote:
>
> On Mon, Oct 27, 2025 at 10:15 AM HaiYang Zhong <wokezhong@...il.com> wrote:
> >
> > Move the packetdrill test to the packetdrill directory and shorten
> > the test duration.
> >
> > In the previous packetdrill test script, the long duration was due to
> > presenting the entire zero-window probe backoff process. The test has
> > been modified to only observe the first few packets to shorten the test
> > time while still effectively verifying the fix.
> >
> > - Moved test to tools/testing/selftests/net/packetdrill/
> > - Reduced test duration from 360+ seconds to under 4 seconds
> >
> > Signed-off-by: HaiYang Zhong <wokezhong@...cent.com>
> > ---
> >  .../packetdrill/tcp_fin_wait1_zero_window.pkt | 34 +++++++++++++++++++
> >  1 file changed, 34 insertions(+)
> >  create mode 100644 tools/testing/selftests/net/packetdrill/tcp_fin_wait1_zero_window.pkt
> >
> > diff --git a/tools/testing/selftests/net/packetdrill/tcp_fin_wait1_zero_window.pkt b/tools/testing/selftests/net/packetdrill/tcp_fin_wait1_zero_window.pkt
> > new file mode 100644
> > index 000000000000..854ede56e7dd
> > --- /dev/null
> > +++ b/tools/testing/selftests/net/packetdrill/tcp_fin_wait1_zero_window.pkt
> > @@ -0,0 +1,34 @@
> > +// Test for permanent FIN-WAIT-1 state with continuous zero-window advertisements
> > +// Author: HaiYang Zhong <wokezhong@...cent.com>
> > +
> > +
> > +0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
> > +0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> > +0.000 bind(3, ..., ...) = 0
> > +0.000 listen(3, 1) = 0
> > +
> > +0.100 < S 0:0(0) win 65535 <mss 1460>
> > +0.100 > S. 0:0(0) ack 1 <mss 1460>
> > +0.100 < . 1:1(0) ack 1 win 65535
> > +0.100 accept(3, ..., ...) = 4
> > +
> > +// Send data to fill receive window
> > +0.200 write(4, ..., 5) = 5
> > +0.200 > P. 1:6(5) ack 1
> > +
> > +// Advertise zero-window
> > +0.200 < . 1:1(0) ack 6 win 0
> > +
> > +// Application closes connection, sends FIN (but blocked by zero window)
> > +0.200 close(4) = 0
> > +
> > +//Send zero-window probe packet
> > ++0.200 > . 5:5(0) ack 1
> > ++0.400 > . 5:5(0) ack 1
> > ++0.800 > . 5:5(0) ack 1
> > +
> > ++1.000 < . 1:1(0) ack 6 win 0
> > +
> > +// Without fix: This probe won't match - timer was reset, probe will be sent 2.600s after the previous probe
> > +// With fix: This probe matches - exponential backoff continues (1.600s after previous probe)
> > ++0.600~+0.700 > . 5:5(0) ack 1
> > --
>
> Thanks for this test!
>
> Kuniyuki rightly raised a concern about the test execution time.
>
> But IMHO it was very nice that the original version of the test
> verified that the connection would eventually be timed out. With this
> shorter version of the test, AFAICT the test does not verify that the
> connection actually times out eventually.
>
> Perhaps if we tune the timeout settings we can achieve both (a) fast
> execution (say, less than 10 secs?), and (b) verify that the
> connection does time out?

+1 for the original test with shorter timeout settings.

If the test still takes longer than 45s with the Neal's
recommendation applied, we can adjust the kselftest timeout like :

diff --git a/tools/testing/selftests/net/packetdrill/Makefile
b/tools/testing/selftests/net/packetdrill/Makefile
index ff54641493e9..33e3311e3ef5 100644
--- a/tools/testing/selftests/net/packetdrill/Makefile
+++ b/tools/testing/selftests/net/packetdrill/Makefile
@@ -1,5 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0

+export kselftest_override_timeout=360
+
 TEST_INCLUDES := \
  defaults.sh \
  ksft_runner.sh \


Also, please post the next version as a separate thread so that
patchwork will not be confused.

https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html#resending-after-review
---8<---
The new version of patches should be posted as a separate thread,
not as a reply to the previous posting. Change log should include a
link to the previous posting
---8<---

Thanks!

>
> Perhaps you can try:
>
> + setting net.ipv4.tcp_orphan_retries to something small, like 3 or 4
> (instead of the default of 0, which dynamically sets the retry count
> to 8 in tcp_orphan_retries())
>
> + setting net.ipv4.tcp_rto_max_ms to something small, like 5000
> (instead of the default of 120000, aka 120 secs)
>
> Another thought: the original test injected a lot of extra rwin=0 ACKs
> that AFAICT a real remote peer would not have sent. IMHO it's better
> to keep the test simpler and more realistic by not having the test
> inject those extra rwin=0 ACKs.