[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YvvXCWYpVCOxZeEt@ZenIV>
Date: Tue, 16 Aug 2022 18:42:33 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Kirill Tkhai <tkhai@...ru>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
davem@...emloft.net, edumazet@...gle.com
Subject: Re: [PATCH v2 0/2] unix: Add ioctl(SIOCUNIXGRABFDS) to grab files of
receive queue skbs
On Tue, Aug 16, 2022 at 12:13:16AM +0300, Kirill Tkhai wrote:
> When a fd owning a counter of some critical resource, say, of a mount,
> it's impossible to umount that mount and disconnect related block device.
> That fd may be contained in some unix socket receive queue skb.
>
> Despite we have an interface for detecting such the sockets queues
> (/proc/[PID]/fdinfo/[fd] shows non-zero scm_fds count if so) and
> it's possible to kill that process to release the counter, the problem is
> that there may be several processes, and it's not a good thing to kill
> each of them.
>
> This patch adds a simple interface to grab files from receive queue,
> so the caller may analyze them, and even do that recursively, if grabbed
> file is unix socket itself. So, the described above problem may be solved
> by this ioctl() in pair with pidfd_getfd().
>
> Note, that the existing recvmsg(,,MSG_PEEK) is not suitable for that
> purpose, since it modifies peek offset inside socket, and this results
> in a problem in case of examined process uses peek offset itself.
> Additional ptrace freezing of that task plus ioctl(SO_PEEK_OFF) won't help
> too, since that socket may relate to several tasks, and there is no
> reliable and non-racy way to detect that. Also, if the caller of such
> trick will die, the examined task will remain frozen forever. The new
> suggested ioctl(SIOCUNIXGRABFDS) does not have such problems.
>
> The realization of ioctl(SIOCUNIXGRABFDS) is pretty simple. The only
> interesting thing is protocol with userspace. Firstly, we let userspace
> to know the number of all files in receive queue skbs. Then we receive
> fds one by one starting from requested offset. We return number of
> received fds if there is a successfully received fd, and this number
> may be less in case of error or desired fds number lack. Userspace
> may detect that situations by comparison of returned value and
> out.nr_all minus in.nr_skip. Looking over different variant this one
> looks the best for me (I considered returning error in case of error
> and there is a received fd. Also I considered returning number of
> received files as one more member in struct unix_ioc_grab_fds).
IMO it's a bad interface. Take a good look at the logics in scan_children();
TCP_LISTEN side is there for reason. And your magical ioctl won't help
with that case.
Powered by blists - more mailing lists