Message-ID: <CANn89iJnQErC8OLoTgnNxU8MURKANbiqXBYaUHsNaTO3m+P54Q@mail.gmail.com>
Date: Thu, 16 Oct 2025 09:10:54 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Kuniyuki Iwashima <kuniyu@...gle.com>
Cc: Neal Cardwell <ncardwell@...gle.com>, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Yuchung Cheng <ycheng@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
Kuniyuki Iwashima <kuni1840@...il.com>, netdev@...r.kernel.org
Subject: Re: [PATCH v1 net-next 1/4] tcp: Make TFO client fallback behaviour consistent.
On Wed, Oct 15, 2025 at 9:02 PM Kuniyuki Iwashima <kuniyu@...gle.com> wrote:
>
> In tcp_send_syn_data(), the TCP Fast Open client could give up
> embedding payload into SYN, but the behaviour is inconsistent.
>
> 1. Send a bare SYN with TFO request (option w/o cookie)
> 2. Send a bare SYN with TFO cookie
>
> When the client does not have a valid cookie, a bare SYN is
> sent with the TFO option without a cookie.
>
> When sendmsg(MSG_FASTOPEN) is called with zero payload and the
> client has a valid cookie, a bare SYN is sent with the TFO
> cookie, which is confusing.
>
> This also happens when tcp_wmem_schedule() fails to charge
> non-zero payload.
>
> OTOH, other fallback paths align with 1. In this case, a TFO
> request is not strictly needed as tcp_fastopen_cookie_check()
> has succeeded, but we can use this round to refresh the TFO
> cookie.
>
> Let's avoid sending TFO cookie w/o payload to make fallback
> behaviour consistent.
>
I am unsure. Could some applications break?
They might prime the cookie cache by initiating a TCP flow with no payload,
so that later, at critical times, they can save one RTT at
connection establishment.
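To make the cases under discussion concrete, here is a minimal userspace model of the SYN the client ends up sending before and after the patch, based only on the commit message above. It is a sketch, not kernel code: the enum, the function names and the charge_ok flag (standing in for tcp_wmem_schedule() managing to charge the payload) are all hypothetical, and the other fallback paths (page frag refill, skb allocation, copy failure) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for what the client puts on the wire. */
enum syn_kind {
	SYN_TFO_REQUEST,	/* bare SYN, TFO option without a cookie */
	SYN_COOKIE_BARE,	/* bare SYN carrying the TFO cookie */
	SYN_COOKIE_DATA,	/* SYN carrying the TFO cookie and payload */
};

/* Behaviour before the patch, as described in the commit message.
 * charge_ok is an assumption modelling tcp_wmem_schedule() success.
 */
enum syn_kind syn_kind_before(bool have_cookie, size_t payload, bool charge_ok)
{
	if (!have_cookie)
		return SYN_TFO_REQUEST;
	if (payload == 0 || !charge_ok)
		return SYN_COOKIE_BARE;	/* the case called confusing */
	return SYN_COOKIE_DATA;
}

/* Behaviour after the patch: every give-up path sends a TFO request. */
enum syn_kind syn_kind_after(bool have_cookie, size_t payload, bool charge_ok)
{
	if (!have_cookie || payload == 0 || !charge_ok)
		return SYN_TFO_REQUEST;
	return SYN_COOKIE_DATA;
}

/* The cases listed in the commit message, written as checks. */
void check_examples(void)
{
	/* No valid cookie: unchanged, bare SYN with a TFO request. */
	assert(syn_kind_before(false, 0, true) == SYN_TFO_REQUEST);
	assert(syn_kind_after(false, 0, true) == SYN_TFO_REQUEST);

	/* Valid cookie, zero payload: cookie is no longer sent. */
	assert(syn_kind_before(true, 0, true) == SYN_COOKIE_BARE);
	assert(syn_kind_after(true, 0, true) == SYN_TFO_REQUEST);

	/* Valid cookie, payload that could not be charged. */
	assert(syn_kind_before(true, 100, false) == SYN_COOKIE_BARE);
	assert(syn_kind_after(true, 100, false) == SYN_TFO_REQUEST);

	/* Valid cookie, payload charged: SYN+data in both versions. */
	assert(syn_kind_before(true, 100, true) == SYN_COOKIE_DATA);
	assert(syn_kind_after(true, 100, true) == SYN_COOKIE_DATA);
}
```

In this model the only inputs whose result changes are the ones where a valid cookie is already cached and no payload ends up in the SYN.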
> Signed-off-by: Kuniyuki Iwashima <kuniyu@...gle.com>
> ---
> net/ipv4/tcp_output.c | 39 +++++++++++++++++++++------------------
> 1 file changed, 21 insertions(+), 18 deletions(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index bb3576ac0ad7d..2847c1ffa1615 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -4151,6 +4151,9 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn)
> if (!tcp_fastopen_cookie_check(sk, &tp->rx_opt.mss_clamp, &fo->cookie))
> goto fallback;
>
> + if (!fo->size)
> + goto fallback;
> +
> /* MSS for SYN-data is based on cached MSS and bounded by PMTU and
> * user-MSS. Reserve maximum option space for middleboxes that add
> * private TCP options. The cost is reduced data space in SYN :(
> @@ -4164,33 +4167,33 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn)
>
> space = min_t(size_t, space, fo->size);
>
> - if (space &&
> - !skb_page_frag_refill(min_t(size_t, space, PAGE_SIZE),
> + if (!skb_page_frag_refill(min_t(size_t, space, PAGE_SIZE),
> pfrag, sk->sk_allocation))
> goto fallback;
> +
> syn_data = tcp_stream_alloc_skb(sk, sk->sk_allocation, false);
> if (!syn_data)
> goto fallback;
> +
> memcpy(syn_data->cb, syn->cb, sizeof(syn->cb));
> - if (space) {
> - space = min_t(size_t, space, pfrag->size - pfrag->offset);
> - space = tcp_wmem_schedule(sk, space);
> - }
> - if (space) {
> +
> + space = min_t(size_t, space, pfrag->size - pfrag->offset);
> + space = tcp_wmem_schedule(sk, space);
> + if (space)
> space = copy_page_from_iter(pfrag->page, pfrag->offset,
> space, &fo->data->msg_iter);
> - if (unlikely(!space)) {
> - tcp_skb_tsorted_anchor_cleanup(syn_data);
> - kfree_skb(syn_data);
> - goto fallback;
> - }
> - skb_fill_page_desc(syn_data, 0, pfrag->page,
> - pfrag->offset, space);
> - page_ref_inc(pfrag->page);
> - pfrag->offset += space;
> - skb_len_add(syn_data, space);
> - skb_zcopy_set(syn_data, fo->uarg, NULL);
> + if (unlikely(!space)) {
> + tcp_skb_tsorted_anchor_cleanup(syn_data);
> + kfree_skb(syn_data);
> + goto fallback;
> }
> +
> + skb_fill_page_desc(syn_data, 0, pfrag->page, pfrag->offset, space);
> + page_ref_inc(pfrag->page);
> + pfrag->offset += space;
> + skb_len_add(syn_data, space);
> + skb_zcopy_set(syn_data, fo->uarg, NULL);
> +
> /* No more data pending in inet_wait_for_connect() */
> if (space == fo->size)
> fo->data = NULL;
> --
> 2.51.0.788.g6d19910ace-goog
>