Message-ID: <CANn89iL355opLJTJUFmiuy7GW5mu9NmihN-xoAtV2=RFVMO3qg@mail.gmail.com>
Date: Mon, 20 Jan 2025 13:34:38 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Zhongqiu Duan <dzq.aishenghu0@...il.com>
Cc: netdev@...r.kernel.org, Jason Xing <kerneljasonxing@...il.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>, "David S. Miller" <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>
Subject: Re: [RFC PATCH] tcp: fill the one wscale sized window to trigger zero
window advertising
On Mon, Jan 20, 2025 at 1:30 PM Zhongqiu Duan <dzq.aishenghu0@...il.com> wrote:
>
> On Sat, Jan 18, 2025 at 3:29 AM Zhongqiu Duan <dzq.aishenghu0@...il.com> wrote:
> >
> > If the rcvbuf of a slow receiver is full, the packet will be dropped
> > because tcp_try_rmem_schedule() cannot schedule more memory for it.
> > Usually the scaled window size is not MSS aligned. If the receiver
> > advertises a window of one wscale unit that falls in (MSS, 2*MSS), and
> > GSO/TSO is disabled, we need at least two packets to fill it. But the
> > receiver will not ACK the first one, and also does not offer a zero
> > window, since we never shrink the offered window.
> > The sender waits for the ACK because the send window is not enough for
> > another MSS sized packet, so tcp_snd_wnd_test() returns false, and it
> > starts the TLP and then the retransmission timer for the first packet
> > until it is ACKed.
> > It may take a long time to resume transmission after the receiver
> > clears the rcvbuf, depending on the number of retransmissions.
> >
> > This issue should be rare today as GSO/TSO is a common technology,
> > especially after 0a6b2a1dc2a2 ("tcp: switch to GSO being always on") or
> > commit d0d598ca86bd ("net: remove sk_route_forced_caps").
> > We can reproduce it by reverting commit 0a6b2a1dc2a2 and disabling hw
> > GSO/TSO on the NIC using ethtool (a), or by enabling MD5SIG (b).
> >
> > Force-split a large packet and send it to fill the window so that the
> > receiver can offer a zero window if it wants to.
> >
> > Reproduce:
> >
> > 1. Set a large value for net.core.rmem_max on the RECV side to get
> > a large wscale value of 11, which gives a window unit of 2048 bytes,
> > larger than the usual MSS of 1448.
> > Set a slightly lower value for net.ipv4.tcp_rmem on the RECV side to
> > trigger the problem quickly. (optional)
> >
> > sysctl net.core.rmem_max=67108864
> > sysctl net.ipv4.tcp_rmem="4096 131072 262144"
> >
> > 2. (a) Build a customized kernel with 0a6b2a1dc2a2 reverted and disable
> > GSO/TSO on the NIC on the SEND side.
> > (b) Or set up an xfrm tunnel with the esp proto and the aead
> > rfc4106(gcm(aes)) algo. (Namespaces and veth are okay; helper xfrm.sh
> > is at the end.)
>
> Sorry, I mixed up some things in the test environment. The xfrm setup
> is completely unnecessary for this reproduction. Just preparing an
> MD5SIG enabled tcp tool is enough for method (b).
>
> It's easy to reproduce on distro kernels: all we need is a slightly
> larger wscale, and to make sure that GSO is disabled in sk_setup_caps().
Please provide a packetdrill test.
I am sorry, I am seeing too many suspect reports these days, with
vague descriptions.
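
For reference, the window arithmetic in the quoted report can be checked
with a short sketch. This is only an illustrative model, not kernel code:
the function names are hypothetical, and the values (wscale 11, MSS 1448)
are the ones cited in the report.

```python
# Illustrative model of the arithmetic in the quoted report; not kernel code.
# All function names here are hypothetical helpers.

MSS = 1448     # MSS cited in the report
WSCALE = 11    # window scale cited in the report


def window_unit(wscale: int) -> int:
    """Smallest non-zero window that can be advertised with this wscale."""
    return 1 << wscale


def full_segments_that_fit(window: int, mss: int) -> int:
    """Number of full MSS-sized segments that fit in the offered window."""
    return window // mss


def leftover_bytes(window: int, mss: int) -> int:
    """Window bytes left once every full-sized segment has been sent."""
    return window % mss


if __name__ == "__main__":
    win = window_unit(WSCALE)
    # One full segment fits; the remainder is smaller than one MSS, so a
    # sender that only emits full-sized segments cannot fill the window.
    print(win, full_segments_that_fit(win, MSS), leftover_bytes(win, MSS))
```

With these values the window unit is 2048 bytes, one full segment fits,
and 600 bytes remain, which is the (MSS, 2*MSS) case the report describes.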