Message-ID: <b9ab6b03-664c-eb81-0fbd-6f696276d9aa@akamai.com>
Date: Sat, 17 Aug 2019 10:19:51 -0400
From: Jason Baron <jbaron@...mai.com>
To: Eric Dumazet <edumazet@...gle.com>,
"David S . Miller" <davem@...emloft.net>
Cc: netdev <netdev@...r.kernel.org>,
Soheil Hassas Yeganeh <soheil@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>,
Eric Dumazet <eric.dumazet@...il.com>,
Vladimir Rutsky <rutsky@...gle.com>
Subject: Re: [PATCH net] tcp: make sure EPOLLOUT wont be missed

On 8/17/19 12:26 AM, Eric Dumazet wrote:
> As Jason Baron explained in commit 790ba4566c1a ("tcp: set SOCK_NOSPACE
> under memory pressure"), it is crucial we properly set SOCK_NOSPACE
> when needed.
>
> However, Jason's patch had a bug, because the 'nonblocking' status,
> as far as sk_stream_wait_memory() is concerned, is governed
> by the MSG_DONTWAIT flag passed at sendmsg() time:
>
> 	long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
>
> So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(),
> and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE
> cleared, if sk->sk_sndtimeo has been set to a small (but not zero)
> value.

Is MSG_DONTWAIT not set in this case? The original patch was intended
only for the explicit non-blocking case. The epoll manpage says:
"EPOLLET flag should use nonblocking file descriptors". So the original
intention was not to impact the blocking case. This seems to me like
a different use-case.
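
For concreteness, here is a minimal user-space sketch of the use-case in question: a descriptor that stays blocking (no O_NONBLOCK, no MSG_DONTWAIT) but has a short SO_SNDTIMEO. The helper name errno_after_send_timeout() is made up for illustration, and an AF_UNIX socketpair is assumed as a stand-in so the sketch runs without a network peer. It only shows that a timed-out blocking send() reports EAGAIN; it does not exercise the TCP SOCK_NOSPACE handling itself.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Fill the send buffer of a *blocking* stream socket that has a short
 * SO_SNDTIMEO, and return the errno of the first failing send().
 * Returns 0 if the setup fails. Hypothetical helper, not from the patch.
 */
static int errno_after_send_timeout(void)
{
	int sv[2];
	char buf[4096];
	struct timeval tv = { .tv_sec = 0, .tv_usec = 50 * 1000 }; /* 50 ms */
	int saved = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return 0;

	/* The descriptor stays blocking: O_NONBLOCK is never set. */
	assert((fcntl(sv[0], F_GETFL) & O_NONBLOCK) == 0);

	if (setsockopt(sv[0], SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0)
		goto out;

	memset(buf, 0, sizeof(buf));
	/* No MSG_DONTWAIT: each send() may sleep for up to SO_SNDTIMEO.
	 * The peer never reads, so the buffer eventually fills up.
	 */
	while (send(sv[0], buf, sizeof(buf), MSG_NOSIGNAL) >= 0)
		;
	saved = errno;
out:
	close(sv[0]);
	close(sv[1]);
	return saved;
}
```

Calling errno_after_send_timeout() is expected to return EAGAIN (or EWOULDBLOCK, which has the same value on Linux), even though the caller never asked for non-blocking behaviour.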

Thanks,

-Jason

> This patch removes the 'noblock' variable since we must always
> set SOCK_NOSPACE if -EAGAIN is returned.
>
> It also renames the do_nonblock label since we might reach this
> code path even if we were in blocking mode.
>
> Fixes: 790ba4566c1a ("tcp: set SOCK_NOSPACE under memory pressure")
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> Cc: Jason Baron <jbaron@...mai.com>
> Reported-by: Vladimir Rutsky <rutsky@...gle.com>
> ---
> net/core/stream.c | 16 +++++++++-------
> 1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/net/core/stream.c b/net/core/stream.c
> index e94bb02a56295ec2db34ab423a8c7c890df0a696..4f1d4aa5fb38d989a9c81f32dfce3f31bbc1fa47 100644
> --- a/net/core/stream.c
> +++ b/net/core/stream.c
> @@ -120,7 +120,6 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
>  	int err = 0;
>  	long vm_wait = 0;
>  	long current_timeo = *timeo_p;
> -	bool noblock = (*timeo_p ? false : true);
>  	DEFINE_WAIT_FUNC(wait, woken_wake_function);
>
>  	if (sk_stream_memory_free(sk))
> @@ -133,11 +132,8 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
>
>  		if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
>  			goto do_error;
> -		if (!*timeo_p) {
> -			if (noblock)
> -				set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> -			goto do_nonblock;
> -		}
> +		if (!*timeo_p)
> +			goto do_eagain;
>  		if (signal_pending(current))
>  			goto do_interrupted;
>  		sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
> @@ -169,7 +165,13 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
>  do_error:
>  	err = -EPIPE;
>  	goto out;
> -do_nonblock:
> +do_eagain:
> +	/* Make sure that whenever EAGAIN is returned, EPOLLOUT event can
> +	 * be generated later.
> +	 * When TCP receives ACK packets that make room, tcp_check_space()
> +	 * only calls tcp_new_space() if SOCK_NOSPACE is set.
> +	 */
> +	set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
>  	err = -EAGAIN;
>  	goto out;
>  do_interrupted:
>
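
The difference between the old and new control flow in the quoted patch can be sketched as a small user-space model. This is only an illustration under simplifying assumptions: struct model_sock and the model_* / nospace_after_eagain() names are invented here, memory is assumed never to become available, and a sleep is modelled as the whole timeout elapsing at once. It is not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-ins for kernel state; names are illustrative only. */
struct model_sock {
	long sndtimeo;	/* models sk->sk_sndtimeo */
	bool nospace;	/* models the SOCK_NOSPACE bit */
};

/* Mirrors the quoted sock_sndtimeo() call: MSG_DONTWAIT forces timeo 0. */
static long model_sndtimeo(const struct model_sock *sk, bool dontwait)
{
	return dontwait ? 0 : sk->sndtimeo;
}

/*
 * Model of the wait loop when memory never becomes available.
 * patched == false: behaviour before the patch ('noblock' sampled on entry).
 * patched == true:  behaviour after the patch (always set on -EAGAIN).
 */
static int model_wait_memory(struct model_sock *sk, long *timeo_p, bool patched)
{
	bool noblock = (*timeo_p == 0);	/* sampled once, on entry */

	for (;;) {
		if (*timeo_p == 0) {
			if (patched || noblock)
				sk->nospace = true;
			return -EAGAIN;
		}
		/* Pretend we slept and the whole timeout elapsed. */
		*timeo_p = 0;
	}
}

/* Returns whether SOCK_NOSPACE ends up set when -EAGAIN is returned. */
static bool nospace_after_eagain(long sndtimeo, bool dontwait, bool patched)
{
	struct model_sock sk = { .sndtimeo = sndtimeo, .nospace = false };
	long timeo = model_sndtimeo(&sk, dontwait);
	int err = model_wait_memory(&sk, &timeo, patched);

	assert(err == -EAGAIN);
	return sk.nospace;
}
```

In this model, nospace_after_eagain(5, false, false) is false: a blocking send with a small non-zero timeout returns -EAGAIN without the flag, which is the case the patch description is concerned with. With patched set to true the flag is set regardless of how the timeout reached zero.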