Message-ID: <CAAvCjhhPYeAVzisrjWJ052USt-7LtADAYQbH6QoGyisLnWJX9g@mail.gmail.com>
Date: Thu, 19 Oct 2023 22:13:54 +0300
From: Dmitry Kravkov <dmitryk@...lt.com>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: Eric Dumazet <edumazet@...gle.com>, Abel Wu <wuyun.abel@...edance.com>,
"David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Soheil Hassas Yeganeh <soheil@...gle.com>, Neal Cardwell <ncardwell@...gle.com>, netdev@...r.kernel.org,
eric.dumazet@...il.com
Subject: Re: [PATCH v2 net] net: do not leave an empty skb in write queue
On Thu, Oct 19, 2023 at 9:01 PM Shakeel Butt <shakeelb@...gle.com> wrote:
>
> +Abel Wu
>
> On Thu, Oct 19, 2023 at 4:24 AM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > Under memory stress conditions, tcp_sendmsg_locked()
> > might call sk_stream_wait_memory(), thus releasing the socket lock.
> >
> > If a fresh skb has been allocated prior to this,
> > we should not leave it in the write queue otherwise
> > tcp_write_xmit() could panic.
Eric, do you happen to have a panic trace? Thanks
> >
> > This apparently does not happen often, but a future change
> > in __sk_mem_raise_allocated() that Shakeel and others are
> > considering would increase chances of being hurt.
> >
> > Under discussion is to remove this controversial part:
> >
> >         /* Fail only if socket is _under_ its sndbuf.
> >          * In this case we cannot block, so that we have to fail.
> >          */
> >         if (sk->sk_wmem_queued + size >= sk->sk_sndbuf) {
> >                 /* Force charge with __GFP_NOFAIL */
> >                 if (memcg_charge && !charged) {
> >                         mem_cgroup_charge_skmem(sk->sk_memcg, amt,
> >                                 gfp_memcg_charge() | __GFP_NOFAIL);
> >                 }
> >                 return 1;
> >         }
> >
> > Fixes: fdfc5c8594c2 ("tcp: remove empty skb from write queue in error cases")
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > Cc: Shakeel Butt <shakeelb@...gle.com>
>
> Reviewed-by: Shakeel Butt <shakeelb@...gle.com>
>
> > ---
> > v2: call tcp_remove_empty_skb() before tcp_push()
> >
> > net/ipv4/tcp.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index d3456cf840de35b28a6adb682e27d426b0a60f84..3d3a24f795734eecd60fc761f25f48b7a27714d4 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -927,10 +927,11 @@ int tcp_send_mss(struct sock *sk, int *size_goal, int flags)
> >         return mss_now;
> >  }
> >
> > -/* In some cases, both sendmsg() could have added an skb to the write queue,
> > - * but failed adding payload on it. We need to remove it to consume less
> > +/* In some cases, sendmsg() could have added an skb to the write queue,
> > + * but failed adding payload on it. We need to remove it to consume less
> >   * memory, but more importantly be able to generate EPOLLOUT for Edge Trigger
> > - * epoll() users.
> > + * epoll() users. Another reason is that tcp_write_xmit() does not like
> > + * finding an empty skb in the write queue.
> >   */
> >  void tcp_remove_empty_skb(struct sock *sk)
> >  {
> > @@ -1289,6 +1290,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
> >
> >  wait_for_space:
> >                 set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
> > +               tcp_remove_empty_skb(sk);
> >                 if (copied)
> >                         tcp_push(sk, flags & ~MSG_MORE, mss_now,
> >                                  TCP_NAGLE_PUSH, size_goal);
> > --
> > 2.42.0.655.g421f12c284-goog
> >
>
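
For anyone following the thread, the cleanup step discussed above can be
illustrated with a small user-space model. This is only a sketch, not kernel
code: struct toy_skb, struct toy_queue and every toy_* function are
hypothetical stand-ins invented for illustration, and the only property
modelled is that an skb whose seq equals its end_seq carries no payload and
should be dropped from the tail of the write queue before the sender goes to
wait for memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff: only the fields this sketch needs. */
struct toy_skb {
    unsigned int seq;      /* first sequence number covered by this skb */
    unsigned int end_seq;  /* one past the last byte; equal to seq means no payload */
    struct toy_skb *prev;  /* previous skb in the queue, NULL for the head */
};

/* Hypothetical stand-in for the socket write queue (tail pointer and length only). */
struct toy_queue {
    struct toy_skb *tail;
    size_t len;
};

/* Append a fresh skb with no payload yet, as a sender would before copying data. */
struct toy_skb *toy_queue_add_empty(struct toy_queue *q, unsigned int seq)
{
    struct toy_skb *skb = calloc(1, sizeof(*skb));

    if (!skb)
        return NULL;
    skb->seq = seq;
    skb->end_seq = seq;
    skb->prev = q->tail;
    q->tail = skb;
    q->len++;
    return skb;
}

/* Drop the tail skb if it carries no payload. Returns 1 if an skb was removed. */
int toy_remove_empty_skb(struct toy_queue *q)
{
    struct toy_skb *skb = q->tail;

    if (!skb || skb->seq != skb->end_seq)
        return 0;
    assert(q->len > 0);
    q->tail = skb->prev;
    q->len--;
    free(skb);
    return 1;
}

/* Scenario: one skb with 100 bytes of payload, then a fresh empty skb,
 * then the cleanup that would run before waiting for memory.
 * Returns the number of skbs left in the queue, or -1 on allocation failure.
 */
int toy_demo(void)
{
    struct toy_queue q = { NULL, 0 };
    struct toy_skb *full = toy_queue_add_empty(&q, 1000);
    int left;

    if (!full)
        return -1;
    full->end_seq += 100;                 /* payload was copied into this one */

    if (!toy_queue_add_empty(&q, 1100)) { /* fresh skb, nothing copied yet */
        free(full);
        return -1;
    }

    toy_remove_empty_skb(&q);             /* removes the empty tail */
    toy_remove_empty_skb(&q);             /* no-op: the tail now has payload */

    left = (int)q.len;
    free(q.tail);
    return left;
}
```

In this model a second call is a no-op because the remaining tail has
payload, so only the skb that never received data is removed.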