[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iLO1JOw8LKoxAYm5M_tXDFyiSVhEovOPsSf9H08gPwj_Q@mail.gmail.com>
Date: Wed, 3 Sep 2025 23:44:28 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: Kuniyuki Iwashima <kuniyu@...gle.com>, "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Neal Cardwell <ncardwell@...gle.com>, Simon Horman <horms@...nel.org>, netdev@...r.kernel.org,
eric.dumazet@...il.com
Subject: Re: [PATCH net-next 1/3] tcp: fix __tcp_close() to only send RST when required
On Wed, Sep 3, 2025 at 11:19 PM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> On Thu, Sep 4, 2025 at 1:32 PM Kuniyuki Iwashima <kuniyu@...gle.com> wrote:
> >
> > On Wed, Sep 3, 2025 at 10:04 PM Jason Xing <kerneljasonxing@...il.com> wrote:
> > >
> > > On Wed, Sep 3, 2025 at 4:47 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > > >
> > > > If the receive queue contains payload that was already
> > > > received, __tcp_close() can send an unexpected RST.
> > > >
> > > > Refine the code to take tp->copied_seq into account,
> > > > as we already do in tcp recvmsg().
> > > >
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > >
> > > Sorry, Eric. I might be wrong, and I don't think it's a bugfix for now.
> > >
> > > IIUC, it's not possible that one skb stays in the receive queue and
> > > all of the data has been consumed in tcp_recvmsg() unless it's
> > > MSG_PEEK mode. So my understanding is that the patch tries to cover
> > > the case where partial data of skb is read by applications and the
> > > whole skb has not been unlinked from the receive queue yet. Sure, as
> > > we can learn from tcp_sendsmg(), skb can be partially read.
> >
> > You can find a clear example in patch 2 that this patch fixes.
>
> Oh, great, a very interesting corner case: resending data with FIN....
Linux TCP stack under memory pressure can do that BTW, no need for
another implementation :)
tcp_send_fin()
> I just wasn't able to read the second patch in time...
>
> Reviewed-by: Jason Xing <kerneljasonxing@...il.com>
>
> Thanks,
> Jason
>
> >
> > Without patch 1, the test fails:
> >
> > # ./ksft_runner.sh tcp_close_no_rst.pkt
> > ...
> > tcp_close_no_rst.pkt:32: error handling packet: live packet field
> > tcp_fin: expected: 1 (0x1) vs actual: 0 (0x0)
> > script packet: 0.140854 F. 1:1(0) ack 1002
> > actual packet: 0.140844 R. 1:1(0) ack 1002 win 65535
> > not ok 1 ipv4
> >
> >
> > >
> > > As long as 'TCP_SKB_CB(skb)->end_seq - TCP_SKB_CB(skb)->seq' has data
> > > len, and the skb still exists in the receive queue, it can directly
> > > means some part of skb hasn't been read yet. We can call it the unread
> > > data case then, so the logic before this patch is right.
> > >
> > > Two conditions (1. skb still stays in the queue, 2. skb has data) make
> > > sure that the data unread case can be detected and then sends an RST.
> > > No need to replace it with copied_seq, I wonder? At least, it's not a
> > > bug.
> > >
> > > Thanks,
> > > Jason
> > >
> > >
> > >
> > >
> > > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > > > ---
> > > > net/ipv4/tcp.c | 9 +++++----
> > > > 1 file changed, 5 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > > > index 40b774b4f587..39eb03f6d07f 100644
> > > > --- a/net/ipv4/tcp.c
> > > > +++ b/net/ipv4/tcp.c
> > > > @@ -3099,8 +3099,8 @@ bool tcp_check_oom(const struct sock *sk, int shift)
> > > >
> > > > void __tcp_close(struct sock *sk, long timeout)
> > > > {
> > > > + bool data_was_unread = false;
> > > > struct sk_buff *skb;
> > > > - int data_was_unread = 0;
> > > > int state;
> > > >
> > > > WRITE_ONCE(sk->sk_shutdown, SHUTDOWN_MASK);
> > > > @@ -3119,11 +3119,12 @@ void __tcp_close(struct sock *sk, long timeout)
> > > > * reader process may not have drained the data yet!
> > > > */
> > > > while ((skb = __skb_dequeue(&sk->sk_receive_queue)) != NULL) {
> > > > - u32 len = TCP_SKB_CB(skb)->end_seq - TCP_SKB_CB(skb)->seq;
> > > > + u32 end_seq = TCP_SKB_CB(skb)->end_seq;
> > > >
> > > > if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
> > > > - len--;
> > > > - data_was_unread += len;
> > > > + end_seq--;
> > > > + if (after(end_seq, tcp_sk(sk)->copied_seq))
> > > > + data_was_unread = true;
> > > > __kfree_skb(skb);
> > > > }
> > > >
> > > > --
> > > > 2.51.0.338.gd7d06c2dae-goog
> > > >
> > > >
Powered by blists - more mailing lists