[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAO3-Pbpep00Up+WADoxun7kDvMruHSpydU6fEmyhnxM+o_jP1g@mail.gmail.com>
Date: Thu, 30 Jan 2025 12:34:53 -0600
From: Yan Zhai <yan@...udflare.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Shuah Khan <shuah@...nel.org>, Josh Hunt <johunt@...mai.com>,
Alexander Duyck <alexander.h.duyck@...ux.intel.com>, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, kernel-team@...udflare.com
Subject: Re: [PATCH v2 net] udp: gso: do not drop small packets when PMTU reduces
On Thu, Jan 30, 2025 at 9:35 AM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> Yan Zhai wrote:
> > Commit 4094871db1d6 ("udp: only do GSO if # of segs > 1") avoided GSO
> > for small packets. But the kernel currently dismisses GSO requests only
> > after checking MTU/PMTU on gso_size. This means any packets, regardless
> > of their payload sizes, could be dropped when PMTU becomes smaller than
> > requested gso_size. We encountered this issue in production and it
> > caused a reliability problem that new QUIC connection cannot be
> > established before PMTU cache expired, while non GSO sockets still
> > worked fine at the same time.
> >
> > Ideally, do not check any GSO related constraints when payload size is
> > smaller than requested gso_size, and return EMSGSIZE instead of EINVAL
> > on MTU/PMTU check failure to be more specific on the error cause.
> >
> > Fixes: 4094871db1d6 ("udp: only do GSO if # of segs > 1")
> > Signed-off-by: Yan Zhai <yan@...udflare.com>
> > --
> > v1->v2: add a missing MTU check when fall back to no GSO mode suggested
> > by Willem de Bruijn <willemdebruijn.kernel@...il.com>; Fixed up commit
> > message to be more precise.
> >
> > v1: https://lore.kernel.org/all/Z5cgWh%2F6bRQm9vVU@debian.debian/
> > ---
> > net/ipv4/udp.c | 28 +++++++++++++++++++---------
> > net/ipv6/udp.c | 28 +++++++++++++++++++---------
> > tools/testing/selftests/net/udpgso.c | 14 ++++++++++++++
> > 3 files changed, 52 insertions(+), 18 deletions(-)
> >
> > diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> > index c472c9a57cf6..0b5010238d05 100644
> > --- a/net/ipv4/udp.c
> > +++ b/net/ipv4/udp.c
> > @@ -1141,9 +1141,20 @@ static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4,
> > const int hlen = skb_network_header_len(skb) +
> > sizeof(struct udphdr);
> >
> > + if (datalen <= cork->gso_size) {
> > + /*
> > + * check MTU again: it's skipped previously when
> > + * gso_size != 0
> > + */
> > + if (hlen + datalen > cork->fragsize) {
> > + kfree_skb(skb);
> > + return -EMSGSIZE;
> > + }
> > + goto no_gso;
>
> This is almost the same as the test below.
>
> How about just
>
> if (hlen + min(cork->gso_size, datalen) > cork->fragsize)
>
> And don't bypass the subsequent checks with a goto, or modify the
> rest of the code.
>
> > + }
> > if (hlen + cork->gso_size > cork->fragsize) {
> > kfree_skb(skb);
> > - return -EINVAL;
> > + return -EMSGSIZE;
> > }
> > if (datalen > cork->gso_size * UDP_MAX_SEGMENTS) {
> > kfree_skb(skb);
> > @@ -1158,17 +1169,16 @@ static int udp_send_skb(struct sk_buff *skb, struct flowi4 *fl4,
> > return -EIO;
> > }
> >
> > - if (datalen > cork->gso_size) {
> > - skb_shinfo(skb)->gso_size = cork->gso_size;
> > - skb_shinfo(skb)->gso_type = SKB_GSO_UDP_L4;
> > - skb_shinfo(skb)->gso_segs = DIV_ROUND_UP(datalen,
> > - cork->gso_size);
> > + skb_shinfo(skb)->gso_size = cork->gso_size;
> > + skb_shinfo(skb)->gso_type = SKB_GSO_UDP_L4;
> > + skb_shinfo(skb)->gso_segs = DIV_ROUND_UP(datalen,
> > + cork->gso_size);
> >
> > - /* Don't checksum the payload, skb will get segmented */
> > - goto csum_partial;
> > - }
> > + /* Don't checksum the payload, skb will get segmented */
> > + goto csum_partial;
> > }
> >
> > +no_gso:
> > if (is_udplite) /* UDP-Lite */
> > csum = udplite_csum(skb);
> >
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index 6671daa67f4f..d97befa7f80d 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -1389,9 +1389,20 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
> > const int hlen = skb_network_header_len(skb) +
> > sizeof(struct udphdr);
> >
> > + if (datalen <= cork->gso_size) {
> > + /*
> > + * check MTU again: it's skipped previously when
> > + * gso_size != 0
> > + */
> > + if (hlen + datalen > cork->fragsize) {
> > + kfree_skb(skb);
> > + return -EMSGSIZE;
> > + }
> > + goto no_gso;
> > + }
> > if (hlen + cork->gso_size > cork->fragsize) {
> > kfree_skb(skb);
> > - return -EINVAL;
> > + return -EMSGSIZE;
> > }
> > if (datalen > cork->gso_size * UDP_MAX_SEGMENTS) {
> > kfree_skb(skb);
> > @@ -1406,17 +1417,16 @@ static int udp_v6_send_skb(struct sk_buff *skb, struct flowi6 *fl6,
> > return -EIO;
> > }
> >
> > - if (datalen > cork->gso_size) {
> > - skb_shinfo(skb)->gso_size = cork->gso_size;
> > - skb_shinfo(skb)->gso_type = SKB_GSO_UDP_L4;
> > - skb_shinfo(skb)->gso_segs = DIV_ROUND_UP(datalen,
> > - cork->gso_size);
> > + skb_shinfo(skb)->gso_size = cork->gso_size;
> > + skb_shinfo(skb)->gso_type = SKB_GSO_UDP_L4;
> > + skb_shinfo(skb)->gso_segs = DIV_ROUND_UP(datalen,
> > + cork->gso_size);
> >
> > - /* Don't checksum the payload, skb will get segmented */
> > - goto csum_partial;
> > - }
> > + /* Don't checksum the payload, skb will get segmented */
> > + goto csum_partial;
> > }
> >
> > +no_gso:
> > if (is_udplite)
> > csum = udplite_csum(skb);
> > else if (udp_get_no_check6_tx(sk)) { /* UDP csum disabled */
> > diff --git a/tools/testing/selftests/net/udpgso.c b/tools/testing/selftests/net/udpgso.c
> > index 3f2fca02fec5..fb73f1c331fb 100644
> > --- a/tools/testing/selftests/net/udpgso.c
> > +++ b/tools/testing/selftests/net/udpgso.c
> > @@ -102,6 +102,13 @@ struct testcase testcases_v4[] = {
> > .gso_len = CONST_MSS_V4,
> > .r_num_mss = 1,
> > },
> > + {
> > + /* datalen <= MSS < gso_len: will fall back to no GSO */
> > + .tlen = CONST_MSS_V4,
> > + .gso_len = CONST_MSS_V4 + 1,
> > + .r_num_mss = 0,
> > + .r_len_last = CONST_MSS_V4,
> > + },
>
> Please also add a test where datalen > MSS < gso_len (with .tfail = true).
>
ACK on both comments.
Thanks
Yan
> > {
> > /* send a single MSS + 1B */
> > .tlen = CONST_MSS_V4 + 1,
> > @@ -205,6 +212,13 @@ struct testcase testcases_v6[] = {
> > .gso_len = CONST_MSS_V6,
> > .r_num_mss = 1,
> > },
> > + {
> > + /* datalen <= MSS < gso_len: will fall back to no GSO */
> > + .tlen = CONST_MSS_V6,
> > + .gso_len = CONST_MSS_V6 + 1,
> > + .r_num_mss = 0,
> > + .r_len_last = CONST_MSS_V6,
> > + },
> > {
> > /* send a single MSS + 1B */
> > .tlen = CONST_MSS_V6 + 1,
> > --
> > 2.30.2
> >
> >
>
>
Powered by blists - more mailing lists