Message-ID: <bb4e66df-7639-0797-49ed-0909fb83a85a@gmail.com>
Date: Fri, 15 Oct 2021 12:59:00 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Jakub Kicinski <kuba@...nel.org>, davem@...emloft.net
Cc: netdev@...r.kernel.org
Subject: Re: [PATCH net-next] net: stream: don't purge sk_error_queue in
sk_stream_kill_queues()
On 10/15/21 6:37 AM, Jakub Kicinski wrote:
> sk_stream_kill_queues() can be called on close when there are
> still outstanding skbs to transmit. Those skbs may try to queue
> notifications to the error queue (e.g. timestamps).
> If sk_stream_kill_queues() purges the queue without taking
> its lock the queue may get corrupted, and skbs leaked.
>
> This shows up as a warning about an rmem leak:
>
> WARNING: CPU: 24 PID: 0 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x...
>
> The leak is always a multiple of 0x300 bytes (the value is in
> %rax on my builds, so RAX: 0000000000000300). 0x300 is truesize of
> an empty sk_buff. Indeed if we dump the socket state at the time
> of the warning the sk_error_queue is often (but not always)
> corrupted. The ->next pointer points back at the list head,
> but not the ->prev pointer. Indeed we can find the leaked skb
> by scanning the kernel memory for something that looks like
> an skb with ->sk = socket in question, and ->truesize = 0x300.
> The contents of ->cb[] of the skb confirms the suspicion that
> it is indeed a timestamp notification (as generated in
> __skb_complete_tx_timestamp()).
>
> Removing purging of sk_error_queue should be okay, since
> inet_sock_destruct() does it again once all socket refs
> are gone. Eric suggests this may cause sockets that go
> through disconnect() to maintain notifications from the
> previous incarnations of the socket, but that should be
> okay since the race was there anyway, and disconnect()
> is not exactly dependable.
>
> Thanks to Jonathan Lemon and Omar Sandoval for help at various
> stages of tracing the issue.
>
> Fixes: cb9eff097831 ("net: new user space API for time stamping of incoming and outgoing packets")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> v1: delete the purge completely
>
> Sorry for the delay from RFC, took a while to get enough
> production signal to confirm the fix.
> ---
> net/core/stream.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/net/core/stream.c b/net/core/stream.c
> index e09ffd410685..06b36c730ce8 100644
> --- a/net/core/stream.c
> +++ b/net/core/stream.c
> @@ -195,9 +195,6 @@ void sk_stream_kill_queues(struct sock *sk)
>  	/* First the read buffer. */
>  	__skb_queue_purge(&sk->sk_receive_queue);
>  
> -	/* Next, the error queue. */
> -	__skb_queue_purge(&sk->sk_error_queue);
> -
>  	/* Next, the write queue. */
>  	WARN_ON(!skb_queue_empty(&sk->sk_write_queue));
>
>
Thanks Jakub!
Reviewed-by: Eric Dumazet <edumazet@...gle.com>
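
For readers following the thread, here is a rough userspace sketch of one
interleaving that could produce the symptom described in the commit message
(->next pointing back at the list head while ->prev does not, plus a 0x300
byte rmem leak). It is only a model under stated assumptions: struct node,
struct model_sock, queue_tail(), purge() and replay_race() are hypothetical
stand-ins and not the kernel's sk_buff code, the two "CPUs" are replayed
sequentially rather than run concurrently, and the actual interleaving hit in
production may well differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct node {
	struct node *next, *prev;
	unsigned int truesize;
};

struct model_sock {
	struct node error_queue;	/* list head, self-pointing when empty */
	unsigned int rmem_alloc;	/* bytes charged for queued notifications */
};

#define EMPTY_SKB_TRUESIZE 0x300u	/* value quoted in the commit message */

static void model_sock_init(struct model_sock *sk)
{
	sk->error_queue.next = sk->error_queue.prev = &sk->error_queue;
	sk->error_queue.truesize = 0;
	sk->rmem_alloc = 0;
}

/* A complete, uninterrupted tail insert that also charges rmem. */
static void queue_tail(struct model_sock *sk, struct node *n)
{
	struct node *head = &sk->error_queue;

	n->truesize = EMPTY_SKB_TRUESIZE;
	sk->rmem_alloc += n->truesize;
	n->next = head;
	n->prev = head->prev;
	head->prev->next = n;
	head->prev = n;
}

/* Lockless purge: unlink and uncharge whatever is reachable via ->next. */
static void purge(struct model_sock *sk)
{
	struct node *head = &sk->error_queue;

	while (head->next != head) {
		struct node *n = head->next;

		n->next->prev = n->prev;
		n->prev->next = n->next;
		sk->rmem_alloc -= n->truesize;
		n->next = n->prev = NULL;
	}
}

/* Replays one bad interleaving and returns the number of leaked bytes. */
unsigned int replay_race(struct model_sock *sk, struct node *a, struct node *b)
{
	struct node *head = &sk->error_queue;

	model_sock_init(sk);
	queue_tail(sk, a);		/* one notification already queued */

	/* "CPU 1" begins a tail insert of b: charge it, set its own links. */
	b->truesize = EMPTY_SKB_TRUESIZE;
	sk->rmem_alloc += b->truesize;
	b->next = head;
	b->prev = head->prev;		/* snapshot of the tail, which is a */

	/* "CPU 2" purges without the lock: a is removed, head is empty. */
	purge(sk);

	/* "CPU 1" finishes the insert using its stale view of the tail. */
	head->prev = b;
	b->prev->next = b;		/* writes to a, which a kernel would have freed */

	/* The final purge at destruct time walks ->next and finds nothing. */
	purge(sk);

	assert(head->next == head);	/* ->next points back at the head */
	assert(head->prev == b);	/* ->prev does not */
	return sk->rmem_alloc;		/* b is still charged but unreachable */
}
```

Calling replay_race() with a model_sock and two nodes on the stack leaves
exactly one node's truesize charged, which is the kind of leftover that
would trip an rmem check at destruct time in this model. With the purge
removed from the close path, the only purge left in the sketch is the final
one, so the insert cannot be interleaved with it.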