[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87v89kio36.fsf@email.froward.int.ebiederm.org>
Date: Wed, 29 Nov 2023 09:19:41 -0600
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: linux-fsdevel@...r.kernel.org, Linus Torvalds
<torvalds@...ux-foundation.org>, Christian Brauner <brauner@...nel.org>,
tytso@....edu, linux-f2fs-devel@...ts.sourceforge.net,
ebiggers@...nel.org, jaegeuk@...nel.org, linux-ext4@...r.kernel.org,
Miklos Szeredi <miklos@...redi.hu>, Gabriel Krisman Bertazi
<gabriel@...sman.be>
Subject: Re: fun with d_invalidate() vs. d_splice_alias() was Re: [f2fs-dev]
[PATCH v6 0/9] Support negative dentries on case-insensitive ext4 and f2fs
Al Viro <viro@...iv.linux.org.uk> writes:
> On Mon, Nov 27, 2023 at 06:38:43AM +0000, Al Viro wrote:
>
>> > FWIW, I suspect that the right answer would be along the lines of
>> > * if d_splice_alias() does move an exsiting (attached) alias in
>> > place, it ought to dissolve all mountpoints in subtree being moved.
>> > There might be subtleties,
>
> Are there ever... Starting with the "our test for loop creation
> (alias is a direct ancestor, need to fail with -ELOOP) is dependent
> upon rename_lock being held all along".
>
> Folks, what semantics do we want for dissolving mounts on splice?
> The situation when it happens is when we have a subtree on e.g. NFS
> and have some mounts (on client) inside that. Then somebody on
> server moves the root of that subtree somewhere else and we try
> to do a lookup in new place. Options:
>
> 1) our dentry for directory that got moved on server is moved into
> new place, along with the entire subtree *and* everything mounted
> on it. Very dubious semantics, especially since if we look the
> old location up before looking for new one, the mounts will be
> dissolved; no way around that.
>
> 2) lookup fails. It's already possible; e.g. if server has
> /srv/nfs/1/2/3 moved to /srv/nfs/x, then /srv/nfs/1/2 moved
> to /srv/nfs/x/y and client has a process with cwd in /mnt/nfs/1/2/3
> doing a lookup for "y", there's no way in hell to handle that -
> the lookup will return the fhandle of /srv/nfs/x, which is the
> same thing the client has for /mnt/nfs/1/2; we *can't* move that
> dentry to /mnt/nfs/1/2/3/y - not without creating a detached loop.
> We can also run into -ESTALE if one of the trylocks in
> __d_unalias() fails. Having the same happen if there are mounts
> in the subtree we are trying to splice would be unpleasant, but
> not fatal. The trouble is, that won't be a transient failure -
> not until somebody tries to look the old location up.
>
> 3) dissolve the mounts. Doable, but it's not easy; especially
> since we end up having to redo the loop-prevention check after
> the mounts had been dissolved. And that check may be failing
> by that time, with no way to undo that dissolving...
To be clear this is a change in current semantics and has a minuscule
change of resulting in a regression. That should be called out in the
change log.
If we choose to change the semantics I would suggest that the new
semantics be:
If a different name for a directory already exists:
* Detach the mounts unconditionally (leaving dentry descendants alone).
* Attempt the current splice.
- If the splice succeeds ( return the new dentry )
- If the splice fails ( fail the lookup, and d_invalidate the existing name )
Unconditionally dissolving the mounts before attempting the rename
should simplify everything.
In the worst case a race between d_invalidate and d_splice_alias will
now become a race to see who can detach the mounts first.
Eric
Powered by blists - more mailing lists