Message-ID: <SJ0PR84MB184707E40732357494D0EC17B2259@SJ0PR84MB1847.NAMPRD84.PROD.OUTLOOK.COM>
Date: Thu, 13 Oct 2022 14:44:02 +0000
From: "Arankal, Nagaraj" <nagaraj.p.arankal@....com>
To: Andrew Lunn <andrew@...n.ch>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: socket leaks observed in Linux kernel's passive close path
Hi Andrew,
Thanks for looking into this. I have not tested this on a v6.0 kernel. As far as I know, there have been no fixes in this area, which is why I posted it; this appears to be a valid case.
Thanks,
Nagaraj P Arankal
-----Original Message-----
From: Andrew Lunn <andrew@...n.ch>
Sent: Thursday, October 13, 2022 7:50 PM
To: Arankal, Nagaraj <nagaraj.p.arankal@....com>
Cc: netdev@...r.kernel.org
Subject: Re: socket leaks observed in Linux kernel's passive close path
On Thu, Oct 13, 2022 at 06:47:56AM +0000, Arankal, Nagaraj wrote:
> Description:
> We have observed a strange race condition in which sockets are not freed in the kernel under the conditions below.
> We have a kernel module that monitors TCP connection state changes. As part of its functionality, it replaces the default sk_destruct function of all TCP sockets with a module-specific routine. It looks like sk_destruct() is not invoked in the following scenario, and hence the sockets are leaked despite a RESET being received from the remote end.
>
> 1. Establish a TCP connection between Host A and Host B.
> 2. Have the client at B initiate CLOSE() immediately after the 3-way handshake.
> 3. The server end sends a huge amount of data to the client and then calls close on the FD.
> 4. The FIN from the client is not ACKed, and the server is busy sending the data.
> 5. A RESET is received from the remote client.
> 6. sk_destruct() is not invoked due to a non-zero sk_refcnt or sk_wmem_alloc count.
>
> Kernel version: Debian Linux 4.19.y(238,247)
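The traffic pattern in steps 1-5 above can be approximated from user space with a small script. This is only a rough sketch under stated assumptions, not the reporter's test setup: it runs both ends on the loopback interface instead of two hosts, the function name run_once and its parameters are made up for illustration, and it does not control whether the client's FIN is ACKed before the RESET arrives (step 4 depends on timing). It also does not touch sk_destruct; any leak would still have to be observed from the kernel side.

```python
import socket


def run_once(payload_size=64 * 1024 * 1024, chunk=64 * 1024):
    """Drive one passive-close scenario; return (bytes_sent, error_name).

    run_once and its parameters are hypothetical names for this sketch.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # loopback stands in for Host A / Host B
    listener.listen(1)
    port = listener.getsockname()[1]

    # Steps 1-2: the client completes the handshake and closes immediately.
    client = socket.create_connection(("127.0.0.1", port))
    client.close()

    conn, _ = listener.accept()
    conn.settimeout(5)  # avoid hanging forever if the send buffer fills up
    sent = 0
    error = None
    buf = b"x" * chunk
    try:
        # Step 3: the server pushes a large amount of data.
        while sent < payload_size:
            sent += conn.send(buf)
    except OSError as exc:
        # Step 5: the peer's RESET surfaces as a send error on the server.
        error = type(exc).__name__
    finally:
        # Step 3 (continued): the server closes its FD.
        conn.close()
        listener.close()
    return sent, error


if __name__ == "__main__":
    print(run_once())
```

Running this in a loop on the affected and on a newer kernel, while the monitoring module is loaded, would be one way to check whether the destructor is still skipped.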
Is this reproducible with a modern kernel, such as v6.0? If it is already fixed, we need to identify which change fixed it and get that backported. If it is broken in v6.0 and net-next, it first needs fixing in net-next and then backporting to the different LTS kernels.
Andrew