Message-Id: <20200914.165840.1091897815096752872.davem@davemloft.net>
Date: Mon, 14 Sep 2020 16:58:40 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: soheil.kdev@...il.com
Cc: netdev@...r.kernel.org, edumazet@...gle.com, soheil@...gle.com
Subject: Re: [PATCH net-next 2/2] tcp: schedule EPOLLOUT after a partial sendmsg
From: Soheil Hassas Yeganeh <soheil.kdev@...il.com>
Date: Mon, 14 Sep 2020 17:52:10 -0400
> From: Soheil Hassas Yeganeh <soheil@...gle.com>
>
> For EPOLLET, applications must call sendmsg until they get EAGAIN.
> Otherwise, there is no guarantee that EPOLLOUT is sent if there was
> a failure upon memory allocation.
>
> As a result, on high-speed NICs, userspace observes multiple small
> sendmsgs after a partial sendmsg until EAGAIN, since TCP can send
> 1-2 TSOs in between two sendmsg syscalls:
>
> // One large partial send due to memory allocation failure.
> sendmsg(20MB) = 2MB
> // Many small sends until EAGAIN.
> sendmsg(18MB) = 64KB
> sendmsg(17.9MB) = 128KB
> sendmsg(17.8MB) = 64KB
> ...
> sendmsg(...) = EAGAIN
> // At this point, userspace can assume an EPOLLOUT.
>
> To fix this, set SOCK_NOSPACE in all partial sendmsg scenarios
> to guarantee that we send EPOLLOUT after a partial sendmsg.
>
> After this commit, userspace can assume that it will receive an EPOLLOUT
> after the first partial sendmsg. This EPOLLOUT will benefit from the
> sk_stream_write_space() logic, which delays the EPOLLOUT until significant
> space is available in the write queue.
>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Signed-off-by: Soheil Hassas Yeganeh <soheil@...gle.com>
Applied.
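
For readers of the archive, the two userspace patterns contrasted in the
quoted description can be sketched roughly as below. This is only an
illustration, not code from the patch: send_some() and its stop_on_partial
flag are hypothetical names. It assumes a stream socket registered with
EPOLLOUT | EPOLLET, and stopping after a partial send is only safe on a
kernel that includes this change.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of the patch): queue as much of buf as the
 * socket accepts without blocking.
 *
 * stop_on_partial == 0: keep calling send() until EAGAIN, which is what
 *   EPOLLET users had to do before this change.
 * stop_on_partial != 0: return after the first partial send, assuming the
 *   kernel will deliver EPOLLOUT later (the behaviour the commit describes).
 *
 * Returns the number of bytes queued, or -1 on a hard error.
 * *need_epollout is set to 1 when the caller should wait for EPOLLOUT.
 */
static ssize_t send_some(int fd, const char *buf, size_t len,
                         int stop_on_partial, int *need_epollout)
{
    size_t off = 0;

    *need_epollout = 0;
    while (off < len) {
        ssize_t n = send(fd, buf + off, len - off,
                         MSG_DONTWAIT | MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry */
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                *need_epollout = 1;     /* send buffer is full */
                break;
            }
            return -1;                  /* hard error, errno is set */
        }
        off += (size_t)n;
        if (off < len && stop_on_partial) {
            *need_epollout = 1;         /* partial send: wait for EPOLLOUT */
            break;
        }
    }
    return (ssize_t)off;
}
```

With stop_on_partial set, the sendmsg(20MB) = 2MB case in the quoted trace
would return right after the first call instead of issuing the run of small
sends that precede EAGAIN. The caller would then resume from the returned
offset once epoll_wait() reports EPOLLOUT.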