netdev - Re: [PATCH net-next 1/3] tcp: fix __tcp_close() to only send RST when required

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAAVpQUBgCyC+y+2M7=WKJVk=sivgeZtE2kwCxDLFCrgezycjZg@mail.gmail.com>
Date: Wed, 3 Sep 2025 22:31:58 -0700
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: Eric Dumazet <edumazet@...gle.com>, "David S . Miller" <davem@...emloft.net>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	Neal Cardwell <ncardwell@...gle.com>, Simon Horman <horms@...nel.org>, netdev@...r.kernel.org, 
	eric.dumazet@...il.com
Subject: Re: [PATCH net-next 1/3] tcp: fix __tcp_close() to only send RST when required

On Wed, Sep 3, 2025 at 10:04 PM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> On Wed, Sep 3, 2025 at 4:47 PM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > If the receive queue contains payload that was already
> > received, __tcp_close() can send an unexpected RST.
> >
> > Refine the code to take tp->copied_seq into account,
> > as we already do in tcp recvmsg().
> >
> > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>
> Sorry, Eric. I might be wrong, and I don't think it's a bugfix for now.
>
> IIUC, it's not possible that one skb stays in the receive queue and
> all of the data has been consumed in tcp_recvmsg() unless it's
> MSG_PEEK mode. So my understanding is that the patch tries to cover
> the case where partial data of skb is read by applications and the
> whole skb has not been unlinked from the receive queue yet. Sure, as
> we can learn from tcp_sendsmg(), skb can be partially read.

You can find a clear example in patch 2 that this patch fixes.

Without patch 1, the test fails:

# ./ksft_runner.sh tcp_close_no_rst.pkt
...
tcp_close_no_rst.pkt:32: error handling packet: live packet field
tcp_fin: expected: 1 (0x1) vs actual: 0 (0x0)
script packet:  0.140854 F. 1:1(0) ack 1002
actual packet:  0.140844 R. 1:1(0) ack 1002 win 65535
not ok 1 ipv4


>
> As long as 'TCP_SKB_CB(skb)->end_seq - TCP_SKB_CB(skb)->seq' has data
> len, and the skb still exists in the receive queue, it can directly
> means some part of skb hasn't been read yet. We can call it the unread
> data case then, so the logic before this patch is right.
>
> Two conditions (1. skb still stays in the queue, 2. skb has data) make
> sure that the data unread case can be detected and then sends an RST.
> No need to replace it with copied_seq, I wonder? At least, it's not a
> bug.
>
> Thanks,
> Jason
>
>
>
>
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > ---
> >  net/ipv4/tcp.c | 9 +++++----
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 40b774b4f587..39eb03f6d07f 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -3099,8 +3099,8 @@ bool tcp_check_oom(const struct sock *sk, int shift)
> >
> >  void __tcp_close(struct sock *sk, long timeout)
> >  {
> > +       bool data_was_unread = false;
> >         struct sk_buff *skb;
> > -       int data_was_unread = 0;
> >         int state;
> >
> >         WRITE_ONCE(sk->sk_shutdown, SHUTDOWN_MASK);
> > @@ -3119,11 +3119,12 @@ void __tcp_close(struct sock *sk, long timeout)
> >          *  reader process may not have drained the data yet!
> >          */
> >         while ((skb = __skb_dequeue(&sk->sk_receive_queue)) != NULL) {
> > -               u32 len = TCP_SKB_CB(skb)->end_seq - TCP_SKB_CB(skb)->seq;
> > +               u32 end_seq = TCP_SKB_CB(skb)->end_seq;
> >
> >                 if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
> > -                       len--;
> > -               data_was_unread += len;
> > +                       end_seq--;
> > +               if (after(end_seq, tcp_sk(sk)->copied_seq))
> > +                       data_was_unread = true;
> >                 __kfree_skb(skb);
> >         }
> >
> > --
> > 2.51.0.338.gd7d06c2dae-goog
> >
> >