Message-ID: <20231123182426.GO38156@ZenIV>
Date: Thu, 23 Nov 2023 18:24:26 +0000
From: Al Viro <viro@...iv.linux.org.uk>
To: Gabriel Krisman Bertazi <gabriel@...sman.be>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Christian Brauner <brauner@...nel.org>, tytso@....edu,
linux-f2fs-devel@...ts.sourceforge.net, ebiggers@...nel.org,
linux-fsdevel@...r.kernel.org, jaegeuk@...nel.org,
linux-ext4@...r.kernel.org
Subject: Re: [f2fs-dev] [PATCH v6 0/9] Support negative dentries on
case-insensitive ext4 and f2fs
On Thu, Nov 23, 2023 at 12:37:43PM -0500, Gabriel Krisman Bertazi wrote:
> > That's the problem I'd been talking about - there is a class of situations
> > where the work done by ext4_lookup() to set the state of dentry gets
> > completely lost. After lookup you do have a dentry in the right place,
> > with the right name and inode, etc., but with NULL
> > ->d_op->d_revalidate.
>
> I get the problem now. I admit to not understanding all the details yet,
> which is why I haven't answered directly, but I understand already how
> it can get borked. I'm studying your explanation.
>
> Originally, ->d_op could be propagated trivially since we had sb->s_d_op
> set, which would be set by __d_alloc, but that is no longer the case
> since we combined fscrypt and CI support.
>
> What I still don't understand is why we shouldn't fixup ->d_op when
> calling d_obtain_alias (before __d_instantiate_anon) and you say we
> better do it in d_splice_alias. The ->d_op is going to be the same
> across the filesystem when the casefold feature is enabled, regardless
> if the directory is casefolded. If we set it there, the alias already
> has the right d_op from the start.
*blink*
A paragraph above you've said that it's not constant over the entire
filesystem.
Look, it's really simple - any setup work of that sort done in ->lookup()
is either misplaced, or should be somehow transferred over to the alias
if one gets picked.
As for d_obtain_alias()... AFAICS, it's far more limited in what information
it could access. It knows the inode, but it has no idea about the parent
to be.
The more I look at that, the more it feels like we need a method that would
tell the filesystem that this dentry is about to be spliced here. 9p is
another place where it would obviously simplify things; the ocfs2 'attach
lock' stuff is another case where things get much more complicated
by having to do that work after splicing, etc.
It's not even hard to do:
1. turn bool exchange in __d_move() arguments into a 3-value thing - move,
exchange or splice. Have the callers in d_splice_alias() and __d_unalias()
pass "splice" instead of false (aka normal move).
2. make __d_move() return an int (normally 0)
3. if asked to splice and if there's target->d_op->d_transfer(), let
__d_move() call it right after
	spin_lock_nested(&dentry->d_lock, 2);
	spin_lock_nested(&target->d_lock, 3);
in there. Passing it target and dentry, obviously. In the unlikely case
of the method returning non-zero, undo the locks and return
that value to the __d_move() caller.
4. d_move() and d_exchange() would ignore the value returned by __d_move();
__d_unalias() would turn
	__d_move(alias, dentry, false);
	ret = 0;
into
	ret = __d_move(alias, dentry, Splice);
and d_splice_alias() would turn
	__d_move(new, dentry, false);
	write_sequnlock(&rename_lock);
into
	err = __d_move(new, dentry, Splice);
	write_sequnlock(&rename_lock);
	if (unlikely(err)) {
		dput(new);
		new = ERR_PTR(err);
	}
(actually, the dput()-on-error part would be common to all 3 branches
in there, so it would probably get pulled out of that if-else if-else).
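For illustration only, steps 1-4 can be mocked up in userspace C. This is a
sketch, not kernel code: the struct layouts, mock_lock()/mock_unlock(),
mock_d_move() and the casefold_*/failing_* helpers are made-up stand-ins,
the real __d_move() does far more (parent locks, hashing, seqcounts), and
->d_transfer() is the hypothetical method proposed above, not something
that exists in the tree.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct dentry;

/* Mock of dentry_operations; d_transfer is the HYPOTHETICAL new method. */
struct dentry_operations {
	int (*d_revalidate)(struct dentry *, unsigned int);
	int (*d_transfer)(struct dentry *target, struct dentry *dentry);
};

/* Mock dentry: only the fields this sketch needs. */
struct dentry {
	const struct dentry_operations *d_op;
	struct dentry *d_parent;
	const char *d_name;
	int locked;		/* stand-in for ->d_lock */
};

/* Step 1: a 3-value mode instead of "bool exchange". */
enum d_move_mode { Move, Exchange, Splice };

static void mock_lock(struct dentry *d)   { assert(!d->locked); d->locked = 1; }
static void mock_unlock(struct dentry *d) { assert(d->locked);  d->locked = 0; }

/* Steps 2 and 3: returns int, calls ->d_transfer() right after locking. */
static int mock_d_move(struct dentry *dentry, struct dentry *target,
		       enum d_move_mode mode)
{
	mock_lock(dentry);
	mock_lock(target);

	if (mode == Splice && target->d_op && target->d_op->d_transfer) {
		int err = target->d_op->d_transfer(target, dentry);
		if (err) {
			/* undo the locks, hand the error to the caller */
			mock_unlock(target);
			mock_unlock(dentry);
			return err;
		}
	}

	if (mode == Exchange) {
		/* swap places */
		struct dentry *p = dentry->d_parent;
		const char *n = dentry->d_name;
		dentry->d_parent = target->d_parent;
		dentry->d_name = target->d_name;
		target->d_parent = p;
		target->d_name = n;
	} else {
		/* dentry takes over target's place; target is left untouched here */
		dentry->d_parent = target->d_parent;
		dentry->d_name = target->d_name;
	}

	mock_unlock(target);
	mock_unlock(dentry);
	return 0;
}

/* Hypothetical filesystem side: carry the state set up by ->lookup() over. */
static int casefold_revalidate(struct dentry *d, unsigned int flags)
{
	(void)d; (void)flags;
	return 1;
}
static int casefold_transfer(struct dentry *target, struct dentry *dentry)
{
	dentry->d_op = target->d_op;
	return 0;
}
static int failing_transfer(struct dentry *target, struct dentry *dentry)
{
	(void)target; (void)dentry;
	return -EIO;
}
static const struct dentry_operations casefold_ops = {
	.d_revalidate = casefold_revalidate,
	.d_transfer = casefold_transfer,
};
static const struct dentry_operations failing_ops = {
	.d_transfer = failing_transfer,
};

/* Returns 0 if both the successful and the failing splice behave as described. */
int splice_demo(void)
{
	struct dentry dir = { .d_name = "dir" };
	struct dentry looked_up = { &casefold_ops, &dir, "File", 0 };
	struct dentry alias = { .d_name = "/" };	/* disconnected, NULL d_op */
	struct dentry bad = { &failing_ops, &dir, "Other", 0 };
	struct dentry alias2 = { .d_name = "/" };

	if (mock_d_move(&alias, &looked_up, Splice))
		return -1;
	if (alias.d_op != &casefold_ops || alias.d_parent != &dir)
		return -1;

	/* step 4: the caller sees the error, nothing moved, locks dropped */
	if (mock_d_move(&alias2, &bad, Splice) != -EIO)
		return -1;
	if (alias2.d_parent != NULL || alias2.locked || bad.locked)
		return -1;
	return 0;
}
```

The point of the mock is only the ordering: the method runs with both
dentries locked and before anything has been moved, so a failure can be
backed out by dropping the locks and returning.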
I can cook a patch doing that (and convert the obvious beneficiaries already
in the tree to it) and throw it into dcache branch - just need to massage
the series in there for repost...
PS: note, BTW, that fscrypt folks have already placed a hook into
__d_move(), exactly for the case of splice; I wonder if that would be
foldable into the same mechanism - haven't looked at the details yet.