[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOQ4uxhJfFgpUKHy0c23i0dsvxZoRuGxMVXbasEn3zf3s0ORYg@mail.gmail.com>
Date: Thu, 28 Aug 2025 09:25:07 +0200
From: Amir Goldstein <amir73il@...il.com>
To: Gabriel Krisman Bertazi <gabriel@...sman.be>, NeilBrown <neil@...wn.name>
Cc: André Almeida <andrealmeid@...lia.com>,
Miklos Szeredi <miklos@...redi.hu>, Theodore Tso <tytso@....edu>, linux-unionfs@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
kernel-dev@...lia.com
Subject: Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
On Thu, Aug 28, 2025 at 5:15 AM Gabriel Krisman Bertazi
<gabriel@...sman.be> wrote:
>
> "NeilBrown" <neil@...wn.name> writes:
>
> > On Thu, 28 Aug 2025, Amir Goldstein wrote:
> >> On Tue, Aug 26, 2025 at 9:01 PM André Almeida <andrealmeid@...lia.com> wrote:
> >> >
> >> >
> >> >
> >> > Em 26/08/2025 04:31, Amir Goldstein escreveu:
> >> > > On Mon, Aug 25, 2025 at 3:31 PM André Almeida <andrealmeid@...lia.com> wrote:
> >> > >>
> >> > >> Hi Amir,
> >> > >>
> >> > >> Em 22/08/2025 16:17, Amir Goldstein escreveu:
> >> > >>
> >> > >> [...]
> >> > >>
> >> > >> /*
> >> > >>>>>> - * Allow filesystems that are case-folding capable but deny composing
> >> > >>>>>> - * ovl stack from case-folded directories.
> >> > >>>>>> + * Exceptionally for layers with casefold, we accept that they have
> >> > >>>>>> + * their own hash and compare operations
> >> > >>>>>> */
> >> > >>>>>> - if (sb_has_encoding(dentry->d_sb))
> >> > >>>>>> - return IS_CASEFOLDED(d_inode(dentry));
> >> > >>>>>> + if (ofs->casefold)
> >> > >>>>>> + return false;
> >> > >>>>>
> >> > >>>>> I think this is better as:
> >> > >>>>> if (sb_has_encoding(dentry->d_sb))
> >> > >>>>> return false;
> >> > >>>>>
> >> > >>>
> >> > >>> And this still fails the test "Casefold enabled" for me.
> >> > >>>
> >> > >>> Maybe you are confused because this does not look like
> >> > >>> a test failure. It looks like this:
> >> > >>>
> >> > >>> generic/999 5s ... [19:10:21][ 150.667994] overlayfs: failed lookup
> >> > >>> in lower (ovl-lower/casefold, name='subdir', err=-116): parent wrong
> >> > >>> casefold
> >> > >>> [ 150.669741] overlayfs: failed lookup in lower (ovl-lower/casefold,
> >> > >>> name='subdir', err=-116): parent wrong casefold
> >> > >>> [ 150.760644] overlayfs: failed lookup in lower (/ovl-lower,
> >> > >>> name='casefold', err=-66): child wrong casefold
> >> > >>> [19:10:24] [not run]
> >> > >>> generic/999 -- overlayfs does not support casefold enabled layers
> >> > >>> Ran: generic/999
> >> > >>> Not run: generic/999
> >> > >>> Passed all 1 tests
> >> > >>>
> >> > >>
> >> > >> This is how the test output looks before my changes[1] to the test:
> >> > >>
> >> > >> $ ./run.sh
> >> > >> FSTYP -- ext4
> >> > >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> >> > >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> >> > >> MKFS_OPTIONS -- -F /dev/vdc
> >> > >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
> >> > >>
> >> > >> generic/999 1s ... [not run] overlayfs does not support casefold enabled
> >> > >> layers
> >> > >> Ran: generic/999
> >> > >> Not run: generic/999
> >> > >> Passed all 1 tests
> >> > >>
> >> > >>
> >> > >> And this is how it looks after my changes[1] to the test:
> >> > >>
> >> > >> $ ./run.sh
> >> > >> FSTYP -- ext4
> >> > >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> >> > >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> >> > >> MKFS_OPTIONS -- -F /dev/vdc
> >> > >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
> >> > >>
> >> > >> generic/999 1s
> >> > >> Ran: generic/999
> >> > >> Passed all 1 tests
> >> > >>
> >> > >> So, as far as I can tell, the casefold enabled is not being skipped
> >> > >> after the fix to the test.
> >> > >
> >> > > Is this how it looks with your v6 or after fixing the bug:
> >> > > https://lore.kernel.org/linux-unionfs/68a8c4d7.050a0220.37038e.005c.GAE@google.com/
> >> > >
> >> > > Because for me this skipping started after fixing this bug
> >> > > Maybe we fixed the bug incorrectly, but I did not see what the problem
> >> > > was from a quick look.
> >> > >
> >> > > Can you test with my branch:
> >> > > https://github.com/amir73il/linux/commits/ovl_casefold/
> >> > >
> >> >
> >> > Right, our branches have a different base, mine is older and based on
> >> > the tag vfs/vfs-6.18.mount.
> >> >
> >> > I have now tested with your branch, and indeed the test fails with
> >> > "overlayfs does not support casefold enabled". I did some debugging and
> >> > the missing commit from my branch that is making this difference here is
> >> > e8bd877fb76bb9f3 ("ovl: fix possible double unlink"). After reverting it
> >> > on top of your branch, the test works. I'm not sure yet why this
> >> > prevents the mount, but this is the call trace when the error happens:
> >>
> >> Wow, that is an interesting development race...
> >>
> >> >
> >> > TID/PID 860/860 (mount/mount):
> >> >
> >> > entry_SYSCALL_64_after_hwframe+0x77
> >> > do_syscall_64+0xa2
> >> > x64_sys_call+0x1bc3
> >> > __x64_sys_fsconfig+0x46c
> >> > vfs_cmd_create+0x60
> >> > vfs_get_tree+0x2e
> >> > ovl_get_tree+0x19
> >> > get_tree_nodev+0x70
> >> > ovl_fill_super+0x53b
> >> > ! 0us [-EINVAL] ovl_parent_lock
> >> >
> >> > And for the ovl_parent_lock() arguments, *parent="work", *child="#7". So
> >> > right now I'm trying to figure out why the dentry for #7 is not hashed.
> >> >
> >>
> >> The reason is this:
> >>
> >> static struct dentry *ext4_lookup(...
> >> {
> >> ...
> >> if (IS_ENABLED(CONFIG_UNICODE) && !inode && IS_CASEFOLDED(dir)) {
> >> /* Eventually we want to call d_add_ci(dentry, NULL)
> >> * for negative dentries in the encoding case as
> >> * well. For now, prevent the negative dentry
> >> * from being cached.
> >> */
> >> return NULL;
> >> }
> >>
> >> return d_splice_alias(inode, dentry);
> >> }
> >>
> >> Neil,
> >>
> >> Apparently, the assumption that
> >> ovl_lookup_temp() => ovl_lookup_upper() => lookup_one()
> >> returns a hashed dentry is not always true.
> >>
> >> It may be always true for all the filesystems that are currently
> >> supported as an overlayfs
> >> upper layer fs (?), but it does not look like you can count on this
> >> for the wider vfs effort
> >> and we should try to come up with a solution for ovl_parent_lock()
> >> that will allow enabling
> >> casefolding on overlayfs layers.
> >>
> >> This patch seems to work. WDYT?
> >>
> >> Thanks,
> >> Amir.
> >>
> >> commit 5dfcd10378038637648f3f422e3d5097eb6faa5f
> >> Author: Amir Goldstein <amir73il@...il.com>
> >> Date: Wed Aug 27 19:55:26 2025 +0200
> >>
> >> ovl: adapt ovl_parent_lock() to casefolded directories
> >>
> >> e8bd877fb76bb9f3 ("ovl: fix possible double unlink") added a sanity
> >> check of !d_unhashed(child) to try to verify that child dentry was not
> >> unlinked while parent dir was unlocked.
> >>
> >> This "was not unlink" check has a false positive result in the case of
> >> casefolded parent dir, because in that case, ovl_create_temp() returns
> >> an unhashed dentry.
> >>
> >> Change the "was not unlinked" check to use cant_mount(child).
> >> cant_mount(child) means that child was unlinked while we have been
> >> holding a reference to child, so it could not have become negative.
> >>
> >> This fixes the error in ovl_parent_lock() in ovl_check_rename_whiteout()
> >> after ovl_create_temp() and allows mount of overlayfs with casefolding
> >> enabled layers.
> >>
> >> Reported-by: André Almeida <andrealmeid@...lia.com>
> >> Link: https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
> >> Signed-off-by: Amir Goldstein <amir73il@...il.com>
> >>
> >> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> >> index bec4a39d1b97c..bffbb59776720 100644
> >> --- a/fs/overlayfs/util.c
> >> +++ b/fs/overlayfs/util.c
> >> @@ -1551,9 +1551,23 @@ void ovl_copyattr(struct inode *inode)
> >>
> >> int ovl_parent_lock(struct dentry *parent, struct dentry *child)
> >> {
> >> + bool is_unlinked;
> >> +
> >> inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> >> - if (!child ||
> >> - (!d_unhashed(child) && child->d_parent == parent))
> >> + if (!child)
> >> + return 0;
> >> +
> >> + /*
> >> + * After re-acquiring parent dir lock, verify that child was not moved
> >> + * to another parent and that it was not unlinked. cant_mount() means
> >> + * that child was unlinked while parent was unlocked. Since we are
> >> + * holding a reference to child, it could not have become negative.
> >> + * d_unhashed(child) is not a strong enough indication for unlinked,
> >> + * because with casefolded parent dir, ovl_create_temp() returns an
> >> + * unhashed dentry.
> >> + */
> >> + is_unlinked = cant_mount(child) || WARN_ON_ONCE(d_is_negative(child));
> >> + if (!is_unlinked && child->d_parent == parent)
> >> return 0;
> >>
> >> inode_unlock(parent->d_inode);
> >>
> >
> > I don't feel comfortable with that. Letting ovl_parent_lock() succeed
> > on an unhashed dentry doesn't work for my longer term plans for locking.
> > I would really rather we got that dentry hashed.
> >
> > What is happening is :
> > - lookup on non-existent name -> unhashed dentry
> > - vfs_create on that dentry - still unhashed
> > - rename of that unhashed dentry -> confusion in ovl_parent_lock()
> >
> > If this were being done from user-space there would be another lookup
> > after the create and before the rename, and that would result in a
> > hashed dentry.
> >
> > Could ovl_create_real() do a lookup for the name if the dentry isn't
> > hashed? That should result in a dentry that can safely be passed to
> > ovl_parent_lock().
>
> Might be a good time to mention I have a branch enabling negative
> dentries in casefolded directories. It didn't have any major issues last
> time I posted, but it didn't get much interest. It should be enough to
> resolve the unhashed dentries after a lookup due to casefolding.
>
> I'd need to revisit and retest, but it is a way out of it.
>
That's definitely a way out, but I don't know if it's needed to unblock the
ovl_casefold work.
I will try Neil's suggestion because it makes sense.
Neil,
FYI, if your future work for vfs assumes that fs will alway have the
dentry hashed after create, you may want to look at:
static int ovl_instantiate(struct dentry *dentry, struct inode *inode,
...
/* Force lookup of new upper hardlink to find its lower */
if (hardlink)
d_drop(dentry);
return 0;
}
If your assumption is not true for overlayfs, it may not be true for other fs
as well. How could you verify that it is correct?
I really hope that you have some opt-in strategy in mind, so those new
dirops assumptions would not have to include all possible filesystems.
Thanks,
Amir.
Powered by blists - more mailing lists