lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-id: <175507128953.2234665.9075244835979746809@noble.neil.brown.name>
Date: Wed, 13 Aug 2025 17:48:09 +1000
From: "NeilBrown" <neil@...wn.name>
To: "Al Viro" <viro@...iv.linux.org.uk>
Cc: "Christian Brauner" <brauner@...nel.org>, "Jan Kara" <jack@...e.cz>,
 "David Howells" <dhowells@...hat.com>,
 "Marc Dionne" <marc.dionne@...istor.com>, "Xiubo Li" <xiubli@...hat.com>,
 "Ilya Dryomov" <idryomov@...il.com>, "Tyler Hicks" <code@...icks.com>,
 "Miklos Szeredi" <miklos@...redi.hu>, "Richard Weinberger" <richard@....at>,
 "Anton Ivanov" <anton.ivanov@...bridgegreys.com>,
 "Johannes Berg" <johannes@...solutions.net>,
 "Trond Myklebust" <trondmy@...nel.org>, "Anna Schumaker" <anna@...nel.org>,
 "Chuck Lever" <chuck.lever@...cle.com>, "Jeff Layton" <jlayton@...nel.org>,
 "Amir Goldstein" <amir73il@...il.com>, "Steve French" <sfrench@...ba.org>,
 "Namjae Jeon" <linkinjeon@...nel.org>, "Carlos Maiolino" <cem@...nel.org>,
 linux-fsdevel@...r.kernel.org, linux-afs@...ts.infradead.org,
 netfs@...ts.linux.dev, ceph-devel@...r.kernel.org, ecryptfs@...r.kernel.org,
 linux-um@...ts.infradead.org, linux-nfs@...r.kernel.org,
 linux-unionfs@...r.kernel.org, linux-cifs@...r.kernel.org,
 linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 02/11] VFS: introduce dentry_lookup() and friends

On Wed, 13 Aug 2025, Al Viro wrote:
> On Tue, Aug 12, 2025 at 12:25:05PM +1000, NeilBrown wrote:
> > This patch is the first step in introducing a new API for locked
> > operation on names in directories.  It supports operations that create or
> > remove names.  Rename operations will also be part of this new API but
> > require different specific interfaces.
> > 
> > The plan is to lock just the dentry (or dentries), not the whole
> > directory.  dentry_lookup() combines locking the directory and
> > performing a lookup prior to a change to the directory.  On success it
> > returns a dentry which is consider to be locked, though at this stage
> > the whole parent directory is actually locked.
> > 
> > dentry_lookup_noperm() does the same without needing a mnt_idmap and
> > without checking permissions.  This is useful for internal filesystem
> > management (e.g.  creating virtual files in response to events) and in
> > other cases similar to lookup_noperm().
> 
> Details, please.  I seriously hope that simple_start_creating() will
> end up used for all of those; your variant allows passing LOOKUP_...
> flags and I would like to understand what the usecases will be.

simple_start_creating() would meet a lot of needs.
A corresponding simple_start_deleting() would suit
cachefiles_lookup_for_cull(), fuse_reverse_inval_entry(),
nfsd4_unlink_clid_dir() etc.

btrfs_ioctl_snap_destroy() would want a simple_start_deleting() but also
wants killable.

cachefiles_get_directory() wants a simple_start_creating() without the
LOOKUP_EXCL so that if is already exists, it can go ahead and use the
dentry without creating.

cachefiles_commit_tmpfile() has a similar need - if it exists it will
unlink and repeat the lookup.  Once it doesn't it it will be target of
vfs_link()

nfsd3_create_file() wants simple_start_creating without LOOKUP_EXCL.
as do a few other nfsd functions.

nfsd4_list_rec_dir() effectively wants simple_start_deleting() (i.e.
fail if it doesn't exist) but sometimes it won't delete, it will do
something else.

All calls pass one of:
   0
   LOOKUP_CREATE
   LOOKUP_CREATE | LOOKUP_EXCL

The first two aren't reliably called for any particular task so a
"simple_start_XXX" sort of name doesn't seem appropriate.


> 
> What's more, IME the "intent" arguments are best avoided - better have
> separate primitives; if internally they call a common helper with some
> flags, etc., it's their business, but exposing that to callers ends
> up with very unpleasant audits down the road.  As soon as you get
> callers that pass something other than explicit constants, you get
> data flow into the mix ("which values can end up passed in this one?")
> and that's asking for trouble.

lookup_no_create, lookup_may_create, lookup_must_create ???

Either as function names, or as an enum to pass to the function?

If we had separate functions we would need _noperm and potentially
_killable versions of each.  Fortunately there is no current need for
_noperm_killable.  Maybe that combinatorial explosion isn't too bad.

> 
> > __dentry_lookup() is a VFS-internal interface which does no permissions
> > checking and assumes that the hash of the name has already been stored
> > in the qstr.  This is useful following filename_parentat().
> > 
> > done_dentry_lookup() is provided which performs the inverse of putting
> > the dentry and unlocking.
> 
> Not sure I like the name, TBH...

I'm open to suggestions for alternatives.

> 
> > Like lookup_one_qstr_excl(), dentry_lookup() returns -ENOENT if
> > LOOKUP_CREATE was NOT given and the name cannot be found,, and returns
> > -EEXIST if LOOKUP_EXCL WAS given and the name CAN be found.
> > 
> > These functions replace all uses of lookup_one_qstr_excl() in namei.c
> > except for those used for rename.
> > 
> > They also allow simple_start_creating() to be simplified into a
> > static-inline.
> 
> Umm...  You've also moved it into linux/namei.h; we'd better verify that
> it's included in all places that need that...

I added includes where necessary.

> 
> > A __free() class is provided to allow done_dentry_lookup() to be called
> > transparently on scope exit.  dget() is extended to ignore ERR_PTR()s
> > so that "return dget(dentry);" is always safe when dentry was provided
> > by dentry_lookup() and the variable was declared __free(dentry_lookup).
> 
> Please separate RAII stuff from the rest of that commit.  Deciding if
> it's worth doing in any given case is hard to do upfront.

I'd rather not - it does make a few changes much nicer.  But I can if it
is necessary.

> 
> > lookup_noperm_common() and lookup_one_common() are moved earlier in
> > namei.c.
> 
> Again, separate commit - reduce the noise in less trivial ones.
> 
> > +struct dentry *dentry_lookup(struct mnt_idmap *idmap,
> > +			     struct qstr *last,
> > +			     struct dentry *base,
> > +			     unsigned int lookup_flags)
> 
> Same problem with flags, *ESPECIALLY* if your endgame involves the
> locking of result dependent upon those.

Locking the result happens precisely if a non-error is returned.  The
lookup flags indicate which circumstances result in errors.

> 
> > -	dput(dentry);
> > +	done_dentry_lookup(dentry);
> >  	dentry = ERR_PTR(error);
> > -unlock:
> > -	inode_unlock(path->dentry->d_inode);
> 
> Incidentally, this combination (dput()+unlock+return ERR_PTR())
> is common enough.  Might be worth a helper (taking error as argument;
> that's one of the reasons why I'm not sure RAII is a good fit for this
> problem space)

I found RAII worked quite well in several places and very well in a few.
I think the main reason I had for *not* using RAII is that you really
need to use it for everything and I didn't want to change code too much.


> 
> > +/* no_free_ptr() must not be used here - use dget() */
> > +DEFINE_FREE(dentry_lookup, struct dentry *, if (_T) done_dentry_lookup(_T))
> 
> UGH.  Please, take that to the very end of the series.  And the comment
> re no_free_ptr() and dget() is really insufficient - you will get a dentry
> reference that outlives your destructor, except that locking environment will
> change.  Which is subtle enough to warrant a discussion.
> 
> Besides, I'm not fond of mixing that with dget(); put another way, this
> subset of dget() uses is worth being clearly marked as such.  A primitive
> that calls dget() to do work?  Sure, no problem.
> 
> We have too many dget() callers with very little indication of what we are
> doing there (besides "bump the refcount"); tree-in-dcache series will
> at least peel off the ones that are about pinning a created object in
> ramfs-style filesystems.  That's not going to cover everything (not even
> close), but let's at least not add to the already messy pile...
> 

Thanks for the review,
NeilBrown

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ