[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-id: <175507128953.2234665.9075244835979746809@noble.neil.brown.name>
Date: Wed, 13 Aug 2025 17:48:09 +1000
From: "NeilBrown" <neil@...wn.name>
To: "Al Viro" <viro@...iv.linux.org.uk>
Cc: "Christian Brauner" <brauner@...nel.org>, "Jan Kara" <jack@...e.cz>,
"David Howells" <dhowells@...hat.com>,
"Marc Dionne" <marc.dionne@...istor.com>, "Xiubo Li" <xiubli@...hat.com>,
"Ilya Dryomov" <idryomov@...il.com>, "Tyler Hicks" <code@...icks.com>,
"Miklos Szeredi" <miklos@...redi.hu>, "Richard Weinberger" <richard@....at>,
"Anton Ivanov" <anton.ivanov@...bridgegreys.com>,
"Johannes Berg" <johannes@...solutions.net>,
"Trond Myklebust" <trondmy@...nel.org>, "Anna Schumaker" <anna@...nel.org>,
"Chuck Lever" <chuck.lever@...cle.com>, "Jeff Layton" <jlayton@...nel.org>,
"Amir Goldstein" <amir73il@...il.com>, "Steve French" <sfrench@...ba.org>,
"Namjae Jeon" <linkinjeon@...nel.org>, "Carlos Maiolino" <cem@...nel.org>,
linux-fsdevel@...r.kernel.org, linux-afs@...ts.infradead.org,
netfs@...ts.linux.dev, ceph-devel@...r.kernel.org, ecryptfs@...r.kernel.org,
linux-um@...ts.infradead.org, linux-nfs@...r.kernel.org,
linux-unionfs@...r.kernel.org, linux-cifs@...r.kernel.org,
linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 02/11] VFS: introduce dentry_lookup() and friends
On Wed, 13 Aug 2025, Al Viro wrote:
> On Tue, Aug 12, 2025 at 12:25:05PM +1000, NeilBrown wrote:
> > This patch is the first step in introducing a new API for locked
> > operation on names in directories. It supports operations that create or
> > remove names. Rename operations will also be part of this new API but
> > require different specific interfaces.
> >
> > The plan is to lock just the dentry (or dentries), not the whole
> > directory. dentry_lookup() combines locking the directory and
> > performing a lookup prior to a change to the directory. On success it
> > returns a dentry which is consider to be locked, though at this stage
> > the whole parent directory is actually locked.
> >
> > dentry_lookup_noperm() does the same without needing a mnt_idmap and
> > without checking permissions. This is useful for internal filesystem
> > management (e.g. creating virtual files in response to events) and in
> > other cases similar to lookup_noperm().
>
> Details, please. I seriously hope that simple_start_creating() will
> end up used for all of those; your variant allows passing LOOKUP_...
> flags and I would like to understand what the usecases will be.
simple_start_creating() would meet a lot of needs.
A corresponding simple_start_deleting() would suit
cachefiles_lookup_for_cull(), fuse_reverse_inval_entry(),
nfsd4_unlink_clid_dir() etc.
btrfs_ioctl_snap_destroy() would want a simple_start_deleting() but also
wants killable.
cachefiles_get_directory() wants a simple_start_creating() without the
LOOKUP_EXCL so that if is already exists, it can go ahead and use the
dentry without creating.
cachefiles_commit_tmpfile() has a similar need - if it exists it will
unlink and repeat the lookup. Once it doesn't it it will be target of
vfs_link()
nfsd3_create_file() wants simple_start_creating without LOOKUP_EXCL.
as do a few other nfsd functions.
nfsd4_list_rec_dir() effectively wants simple_start_deleting() (i.e.
fail if it doesn't exist) but sometimes it won't delete, it will do
something else.
All calls pass one of:
0
LOOKUP_CREATE
LOOKUP_CREATE | LOOKUP_EXCL
The first two aren't reliably called for any particular task so a
"simple_start_XXX" sort of name doesn't seem appropriate.
>
> What's more, IME the "intent" arguments are best avoided - better have
> separate primitives; if internally they call a common helper with some
> flags, etc., it's their business, but exposing that to callers ends
> up with very unpleasant audits down the road. As soon as you get
> callers that pass something other than explicit constants, you get
> data flow into the mix ("which values can end up passed in this one?")
> and that's asking for trouble.
lookup_no_create, lookup_may_create, lookup_must_create ???
Either as function names, or as an enum to pass to the function?
If we had separate functions we would need _noperm and potentially
_killable versions of each. Fortunately there is no current need for
_noperm_killable. Maybe that combinatorial explosion isn't too bad.
>
> > __dentry_lookup() is a VFS-internal interface which does no permissions
> > checking and assumes that the hash of the name has already been stored
> > in the qstr. This is useful following filename_parentat().
> >
> > done_dentry_lookup() is provided which performs the inverse of putting
> > the dentry and unlocking.
>
> Not sure I like the name, TBH...
I'm open to suggestions for alternatives.
>
> > Like lookup_one_qstr_excl(), dentry_lookup() returns -ENOENT if
> > LOOKUP_CREATE was NOT given and the name cannot be found,, and returns
> > -EEXIST if LOOKUP_EXCL WAS given and the name CAN be found.
> >
> > These functions replace all uses of lookup_one_qstr_excl() in namei.c
> > except for those used for rename.
> >
> > They also allow simple_start_creating() to be simplified into a
> > static-inline.
>
> Umm... You've also moved it into linux/namei.h; we'd better verify that
> it's included in all places that need that...
I added includes where necessary.
>
> > A __free() class is provided to allow done_dentry_lookup() to be called
> > transparently on scope exit. dget() is extended to ignore ERR_PTR()s
> > so that "return dget(dentry);" is always safe when dentry was provided
> > by dentry_lookup() and the variable was declared __free(dentry_lookup).
>
> Please separate RAII stuff from the rest of that commit. Deciding if
> it's worth doing in any given case is hard to do upfront.
I'd rather not - it does make a few changes much nicer. But I can if it
is necessary.
>
> > lookup_noperm_common() and lookup_one_common() are moved earlier in
> > namei.c.
>
> Again, separate commit - reduce the noise in less trivial ones.
>
> > +struct dentry *dentry_lookup(struct mnt_idmap *idmap,
> > + struct qstr *last,
> > + struct dentry *base,
> > + unsigned int lookup_flags)
>
> Same problem with flags, *ESPECIALLY* if your endgame involves the
> locking of result dependent upon those.
Locking the result happens precisely if a non-error is returned. The
lookup flags indicate which circumstances result in errors.
>
> > - dput(dentry);
> > + done_dentry_lookup(dentry);
> > dentry = ERR_PTR(error);
> > -unlock:
> > - inode_unlock(path->dentry->d_inode);
>
> Incidentally, this combination (dput()+unlock+return ERR_PTR())
> is common enough. Might be worth a helper (taking error as argument;
> that's one of the reasons why I'm not sure RAII is a good fit for this
> problem space)
I found RAII worked quite well in several places and very well in a few.
I think the main reason I had for *not* using RAII is that you really
need to use it for everything and I didn't want to change code too much.
>
> > +/* no_free_ptr() must not be used here - use dget() */
> > +DEFINE_FREE(dentry_lookup, struct dentry *, if (_T) done_dentry_lookup(_T))
>
> UGH. Please, take that to the very end of the series. And the comment
> re no_free_ptr() and dget() is really insufficient - you will get a dentry
> reference that outlives your destructor, except that locking environment will
> change. Which is subtle enough to warrant a discussion.
>
> Besides, I'm not fond of mixing that with dget(); put another way, this
> subset of dget() uses is worth being clearly marked as such. A primitive
> that calls dget() to do work? Sure, no problem.
>
> We have too many dget() callers with very little indication of what we are
> doing there (besides "bump the refcount"); tree-in-dcache series will
> at least peel off the ones that are about pinning a created object in
> ramfs-style filesystems. That's not going to cover everything (not even
> close), but let's at least not add to the already messy pile...
>
Thanks for the review,
NeilBrown
Powered by blists - more mailing lists