[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADVnQy=umLQzLWo3=ErOsDUHnoOzkg2Wf3SQnGNSfYE_f+rJFQ@mail.gmail.com>
Date: Tue, 10 Sep 2019 20:55:47 -0400
From: Neal Cardwell <ncardwell@...gle.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Eric Dumazet <eric.dumazet@...il.com>,
Soheil Hassas Yeganeh <soheil@...gle.com>,
Yuchung Cheng <ycheng@...gle.com>,
Daniel Borkmann <daniel@...earbox.net>,
Tariq Toukan <tariqt@...lanox.com>
Subject: Re: [PATCH net-next] tcp: force a PSH flag on TSO packets
On Tue, Sep 10, 2019 at 5:49 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> When tcp sends a TSO packet, adding a PSH flag on it
> reduces the sojourn time of GRO packet in GRO receivers.
>
> This is particularly the case under pressure, since RX queues
> receive packets for many concurrent flows.
>
> A sender can give a hint to GRO engines when it is
> appropriate to flush a super-packet, especially when pacing
> is in the picture, since next packet is probably delayed by
> one ms.
>
> Having less packets in GRO engine reduces chance
> of LRU eviction or inflated RTT, and reduces GRO cost.
>
> We found recently that we must not set the PSH flag on
> individual full-size MSS segments [1] :
>
> Under pressure (CWR state), we better let the packet sit
> for a small delay (depending on NAPI logic) so that the
> ACK packet is delayed, and thus next packet we send is
> also delayed a bit. Eventually the bottleneck queue can
> be drained. DCTCP flows with CWND=1 have demonstrated
> the issue.
>
> This patch allows to slowdown the aggregate traffic without
> involving high resolution timers on senders and/or
> receivers.
>
> It has been used at Google for about four years,
> and has been discussed at various networking conferences.
>
> [1] segments smaller than MSS already have PSH flag set
> by tcp_sendmsg() / tcp_mark_push(), unless MSG_MORE
> has been requested by the user.
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Cc: Soheil Hassas Yeganeh <soheil@...gle.com>
> Cc: Neal Cardwell <ncardwell@...gle.com>
> Cc: Yuchung Cheng <ycheng@...gle.com>
> Cc: Daniel Borkmann <daniel@...earbox.net>
> Cc: Tariq Toukan <tariqt@...lanox.com>
> ---
> net/ipv4/tcp_output.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 42abc9bd687a5fea627cbc7cfa750d022f393d84..fec6d67bfd146dc78f0f25173fd71b8b8cc752fe 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1050,11 +1050,22 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
> tcb = TCP_SKB_CB(skb);
> memset(&opts, 0, sizeof(opts));
>
> - if (unlikely(tcb->tcp_flags & TCPHDR_SYN))
> + if (unlikely(tcb->tcp_flags & TCPHDR_SYN)) {
> tcp_options_size = tcp_syn_options(sk, skb, &opts, &md5);
> - else
> + } else {
> tcp_options_size = tcp_established_options(sk, skb, &opts,
> &md5);
> + /* Force a PSH flag on all (GSO) packets to expedite GRO flush
> + * at receiver : This slightly improve GRO performance.
> + * Note that we do not force the PSH flag for non GSO packets,
> + * because they might be sent under high congestion events,
> + * and in this case it is better to delay the delivery of 1-MSS
> + * packets and thus the corresponding ACK packet that would
> + * release the following packet.
> + */
> + if (tcp_skb_pcount(skb) > 1)
> + tcb->tcp_flags |= TCPHDR_PSH;
> + }
> tcp_header_size = tcp_options_size + sizeof(struct tcphdr);
>
> /* if no packet is in qdisc/device queue, then allow XPS to select
Acked-by: Neal Cardwell <ncardwell@...gle.com>
Thanks, Eric!
neal
Powered by blists - more mailing lists