[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-id: <177034031005.16766.246184445940612287@noble.neil.brown.name>
Date: Fri, 06 Feb 2026 12:11:50 +1100
From: NeilBrown <neilb@...mail.net>
To: "Amir Goldstein" <amir73il@...il.com>
Cc: "Christian Brauner" <brauner@...nel.org>,
"Alexander Viro" <viro@...iv.linux.org.uk>,
"David Howells" <dhowells@...hat.com>, "Jan Kara" <jack@...e.cz>,
"Chuck Lever" <chuck.lever@...cle.com>, "Jeff Layton" <jlayton@...nel.org>,
"Miklos Szeredi" <miklos@...redi.hu>,
"John Johansen" <john.johansen@...onical.com>,
"Paul Moore" <paul@...l-moore.com>, "James Morris" <jmorris@...ei.org>,
"Serge E. Hallyn" <serge@...lyn.com>,
"Stephen Smalley" <stephen.smalley.work@...il.com>,
linux-kernel@...r.kernel.org, netfs@...ts.linux.dev,
linux-fsdevel@...r.kernel.org, linux-nfs@...r.kernel.org,
linux-unionfs@...r.kernel.org, apparmor@...ts.ubuntu.com,
linux-security-module@...r.kernel.org, selinux@...r.kernel.org
Subject: Re: [PATCH 10/13] ovl: change ovl_create_real() to get a new lock
when re-opening created file.
On Thu, 05 Feb 2026, Amir Goldstein wrote:
> On Wed, Feb 4, 2026 at 6:09 AM NeilBrown <neilb@...mail.net> wrote:
> >
> > From: NeilBrown <neil@...wn.name>
> >
> > When ovl_create_real() is used to create a file on the upper filesystem
> > it needs to return the resulting dentry - positive and hashed.
> > It is usually the case the that dentry passed to the create function
> > (e.g. vfs_create()) will be suitable but this is not guaranteed. The
> > filesystem may unhash that dentry forcing a repeat lookup next time the
> > name is wanted.
> >
> > So ovl_create_real() must be (and is) aware of this and prepared to
> > perform that lookup to get a hash positive dentry.
> >
> > This is currently done under that same directory lock that provided
> > exclusion for the create. Proposed changes to locking will make this
> > not possible - as the name, rather than the directory, will be locked.
> > The new APIs provided for lookup and locking do not and cannot support
> > this pattern.
> >
> > The lock isn't needed. ovl_create_real() can drop the lock and then get
> > a new lock for the lookup - then check that the lookup returned the
> > correct inode. In a well-behaved configuration where the upper
> > filesystem is not being modified by a third party, this will always work
> > reliably, and if there are separate modification it will fail cleanly.
> >
> > So change ovl_create_real() to drop the lock and call
> > ovl_start_creating_upper() to find the correct dentry. Note that
> > start_creating doesn't fail if the name already exists.
> >
> > This removes the only remaining use of ovl_lookup_upper, so it is
> > removed.
> >
> > Signed-off-by: NeilBrown <neil@...wn.name>
> > ---
> > fs/overlayfs/dir.c | 24 ++++++++++++++++++------
> > fs/overlayfs/overlayfs.h | 7 -------
> > 2 files changed, 18 insertions(+), 13 deletions(-)
> >
> > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
> > index ff3dbd1ca61f..ec08904d084d 100644
> > --- a/fs/overlayfs/dir.c
> > +++ b/fs/overlayfs/dir.c
> > @@ -219,21 +219,33 @@ struct dentry *ovl_create_real(struct ovl_fs *ofs, struct dentry *parent,
> > err = -EIO;
> > } else if (d_unhashed(newdentry)) {
> > struct dentry *d;
> > + struct name_snapshot name;
> > /*
> > * Some filesystems (i.e. casefolded) may return an unhashed
> > - * negative dentry from the ovl_lookup_upper() call before
> > + * negative dentry from the ovl_start_creating_upper() call before
> > * ovl_create_real().
>
>
> According to the new locking rules, if the hashed dentry itself is
> the synchronization object, is it going to be allowed to
> filesystem to unhash the dentry while the dentry still in the
> "creating" scope? It is hard for me to wrap my head around this.
It can be confusing....
It will be important for the name the remain locked (and hashed) until
the operation (create, remove, rename) either succeeds or fails. So
leaving a dentry unhashed will be OK providing a subsequent lookup will
also succeed or fail in the same way. The caller must be able to use
the dentry to access the object (i.e. the inode) on success, but they
is nothing in POSIX that requires that the object still has any
particular name.
>
> Or do we need this here because some filesystems (casefold in
> particular) are not going to support parallel creations?
There is no reason that a casefolding filesystem would not support parallel
ops. And it isn't just casefolding that acts like this. At least one of
the special filesystems (tracefs maybe) always unhashes on create. You
only ever get a hashed positive dentry as a result of lookup.
(overlayfs would never see this case of course).
>
> > * In that case, lookup again after making the newdentry
> > * positive, so ovl_create_upper() always returns a hashed
> > * positive dentry.
> > + * As we have to drop the lock before the lookup a race
> > + * could result in a lookup failure. In that case we return
> > + * an error.
> > */
> > - d = ovl_lookup_upper(ofs, newdentry->d_name.name, parent,
> > - newdentry->d_name.len);
> > - dput(newdentry);
> > - if (IS_ERR_OR_NULL(d))
> > + take_dentry_name_snapshot(&name, newdentry);
> > + end_creating_keep(newdentry);
> > + d = ovl_start_creating_upper(ofs, parent, &name.name);
> > + release_dentry_name_snapshot(&name);
>
> OK. not saying no to this (yet) but I have to admit that it is pretty
> ugly that the callers of ovl_create_real() want to create a specific
> stable name, which is could be passed in as const char *name
> and yet we end up doing this weird dance here just to keep the name
> from newdentry.
There are three callers of ovl_create_real()
ovl_lookup_or_create() does have a "const char *name".
ovl_create_upper() has a stable dentry from which it can copy a QSTR
ovl_create_temp() would need some sort of dance to keep hold of the
temporary name that was allocated.
If it weren't for ovl_create_temp() I would agree with you.
Though we could have the three callers of ovl_start_creating_temp() pass a
"char name[OVL_TEMPNAME_SIZE]" in, then ovl_create_temp() would have
easy access.
I could do that if you like.
Thanks,
NeilBrown
>
> Thanks,
> Amir.
>
Powered by blists - more mailing lists