lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-id: <175314170043.2234665.2076268504915475261@noble.neil.brown.name>
Date: Tue, 22 Jul 2025 09:48:20 +1000
From: "NeilBrown" <neil@...wn.name>
To: "Amir Goldstein" <amir73il@...il.com>
Cc: "Linus Torvalds" <torvalds@...ux-foundation.org>,
 "Alexander Viro" <viro@...iv.linux.org.uk>,
 "Christian Brauner" <brauner@...nel.org>, "Jan Kara" <jack@...e.cz>,
 linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/7 RFC] New APIs for name lookup and lock for directory
 operations

On Mon, 21 Jul 2025, Amir Goldstein wrote:
> On Mon, Jul 21, 2025 at 10:46 AM NeilBrown <neil@...wn.name> wrote:
> >
> > Hi,
> >
> >  these patches (against vfs.all) primarily introduce new APIs for
> >  preparing dentries for create, remove, rename.  The goal is to
> >  centralise knowledge of how we do locking (currently by locking the
> >  directory) so that we can eventually change the mechanism (e.g.  to
> >  locking just the dentry).
> >
> >  Naming is difficult and I've changed my mind several times. :-)
> 
> Indeed it is.
> I generally like the done_ approach that you took.
> Few minor naming comments follow.
> 
> >
> >  The basic approach is to return a dentry which can be passed to
> >  vfs_create(), vfs_unlink() etc, and subsequently to release that
> >  dentry.  The closest analogue to this in the VFS is kern_path_create()
> >  which is paired with done_path_create(), though there is also
> >  kern_path_locked() which is paired with explicit inode_unlock() and
> >  dput().  So my current approach uses "done_" for finishing up.
> >
> >  I have:
> >    dentry_lookup() dentry_lookup_noperm() dentry_lookup_hashed()
> 
> As I wrote on the patch that introduces them I find dentry_lookup_hashed()
> confusing because the dentry is not hashed (only the hash is calculated).
> 
> Looking at another precedent of _noperm() vfs API we have:
> 
> vfs_setxattr()
>   __vfs_setxattr_locked()
>     __vfs_setxattr_noperm()
>       __vfs_setxattr()
> 
> Do I'd say for lack of better naming __dentry_lookup() could makes sense
> for the bare lock&dget and it could also be introduced earlier along with
> introducing done_dentry_lookup()
> 
> >    dentry_lookup_killable()
> >  paired with
> >    done_dentry_lookup()
> >
> >  and also
> >    rename_lookup() rename_lookup_noperm() rename_lookup_hashed()
> >  paired with
> >    done_rename_lookup()
> >  (these take a "struct renamedata *" to which some qstrs are added.
> >
> >  There is also "dentry_lock_in()" which is used instead of
> >  dentry_lookup() when you already have the dentry and want to lock it.
> >  So you "lock" it "in" a given parent.  I'm not very proud of this name,
> >  but I don't want to use "dentry_lock" as I want to save that for
> >  low-level locking primitives.
> 
> Very strange name :)

I wanted to encourage people to comment !!

> 
> What's wrong with dentry_lock_parent()?

That sounds like we are locking the parent, but we are conceptually locking
the dentry (Though at first we do that by locking the whole directory).

> 
> Although I think that using the verb _lock_ for locking and dget is
> actively confusing, so something along the lines of
> resume_dentry_lookup()/dentry_lookup_reacquire() might serve the
> readers of the code better.

Hmmm....  there is certainly potential there.  dentry_lookup_continue()
???

> 
> >
> >  There is also done_dentry_lookup_return() which doesn't dput() the
> >  dentry but returns it instread.  In about 1/6 of places where I need
> >  done_dentry_lookup() the code makes use of the dentry afterwards.  Only
> >  in half the places where done_dentry_lookup_return() is used is the
> >  returned value immediately returned by the calling function.  I could
> >  do a dget() before done_dentry_lookup(), but that looks awkward and I
> >  think having the _return version is justified.  I'm happy to hear other
> >  opinions.
> 
> The name is not very descriptive IMO, but I do not have a better suggestion.
> Unless you can describe it for the purpose that it is used for, e.g.
> yeild_dentry_lookup() that can be followed with resume_dentry_lookup(),
> but I do not know if those are your intentions for the return API.

The intention is that a few places (maybe a dozen) need to keep a
reference to the dentry after the operation completes.  Typically this
is a mkdir or a create.  The caller might then want to create another
file in the directory so it holds on to the dentry.
A fairly common case is that virtual filesystems hold an extra reference
to everything they create so the dcache becomes the primary store.

I suspect that in the latter case it is better to have an explicit
dget() with a comment saying "preserve in cache until explicitly
removed" or similar.
For the former case it might be better to introduce a new dentry
variable.

 dentry = dentry_lookup(....);
 dentry = vfs_mkdir(dir, dentry);
 if (!IS_ERR(dentry))
        returnval = dget(dentry);
 done_dentry_lookup(dentry)

 return returnval;

Maybe I should enhance dget() to pass through errors as well as NULL.

Maybe the above would look even better as

 struct dentry *dentry __free(lookup) = dentry_lookup(.....);
 /* handle error */
 dentry = vfs_mkdir(dir, dentry);
 return dget(dentry);

So I'll try dropping the _return version, enhancing dget, and adding a
DEFINE_FREE.

Thanks,
NeilBrown



> 
> Thanks,
> Amir.
> 
> >
> >  In order for this dentry-focussed API to work we need to have the
> >  dentry to unlock.  vfs_rmdir() currently consumes the dentry on
> >  failure, so we don't have it unless we clumsily keep a copy.  So an
> >  early patch changes vfs_rmdir() to both consume the dentry and drop the
> >  lock on failure.
> >
> >  After these new APIs are refined, agreed, and applied I will have a
> >  collection of patches to roll them out throughout the kernel.  Then we
> >  can start/continue discussing a new approach to locking which allows
> >  directory operations to proceed in parallel.
> >
> >  If you want a sneak peek at some of this future work - for context
> >  mostly - my current devel code is at https://github.com/neilbrown/linux.git
> >  in a branch "pdirops".  Be warned that a lot of the later code is under
> >  development, is known to be wrong, and doesn't even compile.  Not today
> >  anyway.  The rolling out of the new APIs is fairly mature though.
> >
> >  Please review and suggest better names, or tell me that my choices are adequate.
> >  And find the bugs in the code too :-)
> >
> >  I haven't cc:ed the maintains of the non-VFS code that the patches
> >  touch.  I can do that once the approach and names have been approved.
> >
> > Thanks,
> > NeilBrown
> >
> >
> >  [PATCH 1/7] VFS: unify old_mnt_idmap and new_mnt_idmap in renamedata
> >  [PATCH 2/7] VFS: introduce done_dentry_lookup()
> >  [PATCH 3/7] VFS: Change vfs_mkdir() to unlock on failure.
> >  [PATCH 4/7] VFS: introduce dentry_lookup() and friends
> >  [PATCH 5/7] VFS: add dentry_lookup_killable()
> >  [PATCH 6/7] VFS: add rename_lookup()
> >  [PATCH 7/7] VFS: introduce dentry_lock_in()
> >
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ