Message-ID: <YutBc9aCQOvPPlWN@ZenIV>
Date: Thu, 4 Aug 2022 04:48:03 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Tony Lu <tonylu@...ux.alibaba.com>
Cc: kgraul@...ux.ibm.com, kuba@...nel.org, davem@...emloft.net,
netdev@...r.kernel.org, linux-s390@...r.kernel.org,
linux-rdma@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH net-next] net/smc: Introduce TCP ULP support
On Thu, Aug 04, 2022 at 03:56:11AM +0100, Al Viro wrote:
> Half a year too late, but then it hadn't been posted on fsdevel.
> Which it really should have been, due to
>
> > + /* replace tcp socket to smc */
> > + smcsock->file = tcp->file;
> > + smcsock->file->private_data = smcsock;
> > + smcsock->file->f_inode = SOCK_INODE(smcsock); /* replace inode when sock_close */
> > + smcsock->file->f_path.dentry->d_inode = SOCK_INODE(smcsock); /* dput() in __fput */
> > + tcp->file = NULL;
>
> this. It violates a bunch of rather fundamental assertions about the
> data structures you are playing with, and I'm not even going into the
> lifetime and refcounting issues.
>
> * ->d_inode of a busy positive dentry never changes while refcount
> of dentry remains positive. A lot of places in VFS rely upon that.
> * ->f_inode of a file never changes, period.
> * ->private_data of a struct file associated with a socket never
> changes; it can be accessed lockless, with no precautions beyond "make sure
> that refcount of struct file will remain positive".

Consider, BTW, what it does to sockfd_lookup() users. We grab a reference
to struct file, pick struct socket from its ->private_data, work with that
sucker, then do sockfd_put(). Which does fput(sock->file).

Guess what happens if sockfd_lookup() is given the descriptor of your
TCP socket, just before that tcp->file = NULL? Right, fput(NULL) as
soon as matching sockfd_put() is called. And the very first thing fput()
does is this:

	if (atomic_long_dec_and_test(&file->f_count)) {

And that's just one example - a *lot* of places both in VFS and in
net/* rely upon these assertions. This is really not a workable approach.
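
To make the interleaving concrete, here is a minimal userspace sketch of it. Everything in it is a stand-in: the struct definitions are cut down to the three fields involved, and the model_*() helpers and demo_race() are hypothetical names that only mimic the shape of sockfd_lookup(), sockfd_put(), fput() and the quoted hunk. The NULL check in model_fput() exists only so the model can count the bad call instead of crashing; the real fput() goes straight to file->f_count.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel structures; not the real definitions. */
struct file {
	long f_count;        /* models file->f_count */
	void *private_data;  /* models file->private_data */
};

struct socket {
	struct file *file;   /* models sock->file */
};

static int null_fputs;   /* how many times model_fput() was handed NULL */

/* Models fput(): the real one dereferences file->f_count immediately. */
static void model_fput(struct file *file)
{
	if (file == NULL) {  /* model-only check, so we can count instead of crash */
		null_fputs++;
		return;
	}
	file->f_count--;
}

/* Models sockfd_lookup(): pin the file, return the socket it points to.
 * Assumption: takes the file directly instead of resolving a descriptor. */
static struct socket *model_sockfd_lookup(struct file *file)
{
	file->f_count++;
	return file->private_data;
}

/* Models sockfd_put(): drop the reference through sock->file. */
static void model_sockfd_put(struct socket *sock)
{
	model_fput(sock->file);
}

/* Models the part of the quoted hunk that moves the file from tcp to smcsock. */
static void model_ulp_replace(struct socket *tcp, struct socket *smcsock)
{
	smcsock->file = tcp->file;
	smcsock->file->private_data = smcsock;
	tcp->file = NULL;
}

/* Replays the interleaving; returns the number of fput(NULL) calls seen. */
int demo_race(void)
{
	struct file f = { .f_count = 1, .private_data = NULL };
	struct socket tcp = { .file = &f };
	struct socket smc = { .file = NULL };
	struct socket *sock;

	f.private_data = &tcp;
	null_fputs = 0;

	sock = model_sockfd_lookup(&f);   /* caller A: gets &tcp, holds a file ref */
	assert(sock == &tcp);
	model_ulp_replace(&tcp, &smc);    /* caller B: tcp.file becomes NULL */
	model_sockfd_put(sock);           /* caller A: fput(tcp.file), i.e. NULL */

	return null_fputs;
}
```

In this model demo_race() returns 1: the put goes through tcp.file, which the swap has already cleared, and the reference taken by the lookup is never dropped from the file it was taken on.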