[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-id: <175633908557.2234665.14959580663322237611@noble.neil.brown.name>
Date: Thu, 28 Aug 2025 09:58:05 +1000
From: "NeilBrown" <neil@...wn.name>
To: "Amir Goldstein" <amir73il@...il.com>
Cc: André Almeida <andrealmeid@...lia.com>,
"Miklos Szeredi" <miklos@...redi.hu>, "Theodore Tso" <tytso@....edu>,
"Gabriel Krisman Bertazi" <krisman@...nel.org>, linux-unionfs@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
"Alexander Viro" <viro@...iv.linux.org.uk>,
"Christian Brauner" <brauner@...nel.org>, "Jan Kara" <jack@...e.cz>,
kernel-dev@...lia.com
Subject:
Re: [PATCH v6 9/9] ovl: Support mounting case-insensitive enabled layers
On Thu, 28 Aug 2025, Amir Goldstein wrote:
> On Tue, Aug 26, 2025 at 9:01 PM André Almeida <andrealmeid@...lia.com> wrote:
> >
> >
> >
> > Em 26/08/2025 04:31, Amir Goldstein escreveu:
> > > On Mon, Aug 25, 2025 at 3:31 PM André Almeida <andrealmeid@...lia.com> wrote:
> > >>
> > >> Hi Amir,
> > >>
> > >> Em 22/08/2025 16:17, Amir Goldstein escreveu:
> > >>
> > >> [...]
> > >>
> > >> /*
> > >>>>>> - * Allow filesystems that are case-folding capable but deny composing
> > >>>>>> - * ovl stack from case-folded directories.
> > >>>>>> + * Exceptionally for layers with casefold, we accept that they have
> > >>>>>> + * their own hash and compare operations
> > >>>>>> */
> > >>>>>> - if (sb_has_encoding(dentry->d_sb))
> > >>>>>> - return IS_CASEFOLDED(d_inode(dentry));
> > >>>>>> + if (ofs->casefold)
> > >>>>>> + return false;
> > >>>>>
> > >>>>> I think this is better as:
> > >>>>> if (sb_has_encoding(dentry->d_sb))
> > >>>>> return false;
> > >>>>>
> > >>>
> > >>> And this still fails the test "Casefold enabled" for me.
> > >>>
> > >>> Maybe you are confused because this does not look like
> > >>> a test failure. It looks like this:
> > >>>
> > >>> generic/999 5s ... [19:10:21][ 150.667994] overlayfs: failed lookup
> > >>> in lower (ovl-lower/casefold, name='subdir', err=-116): parent wrong
> > >>> casefold
> > >>> [ 150.669741] overlayfs: failed lookup in lower (ovl-lower/casefold,
> > >>> name='subdir', err=-116): parent wrong casefold
> > >>> [ 150.760644] overlayfs: failed lookup in lower (/ovl-lower,
> > >>> name='casefold', err=-66): child wrong casefold
> > >>> [19:10:24] [not run]
> > >>> generic/999 -- overlayfs does not support casefold enabled layers
> > >>> Ran: generic/999
> > >>> Not run: generic/999
> > >>> Passed all 1 tests
> > >>>
> > >>
> > >> This is how the test output looks before my changes[1] to the test:
> > >>
> > >> $ ./run.sh
> > >> FSTYP -- ext4
> > >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> > >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> > >> MKFS_OPTIONS -- -F /dev/vdc
> > >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
> > >>
> > >> generic/999 1s ... [not run] overlayfs does not support casefold enabled
> > >> layers
> > >> Ran: generic/999
> > >> Not run: generic/999
> > >> Passed all 1 tests
> > >>
> > >>
> > >> And this is how it looks after my changes[1] to the test:
> > >>
> > >> $ ./run.sh
> > >> FSTYP -- ext4
> > >> PLATFORM -- Linux/x86_64 archlinux 6.17.0-rc1+ #1174 SMP
> > >> PREEMPT_DYNAMIC Mon Aug 25 10:18:09 -03 2025
> > >> MKFS_OPTIONS -- -F /dev/vdc
> > >> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /tmp/dir2
> > >>
> > >> generic/999 1s
> > >> Ran: generic/999
> > >> Passed all 1 tests
> > >>
> > >> So, as far as I can tell, the casefold enabled is not being skipped
> > >> after the fix to the test.
> > >
> > > Is this how it looks with your v6 or after fixing the bug:
> > > https://lore.kernel.org/linux-unionfs/68a8c4d7.050a0220.37038e.005c.GAE@google.com/
> > >
> > > Because for me this skipping started after fixing this bug
> > > Maybe we fixed the bug incorrectly, but I did not see what the problem
> > > was from a quick look.
> > >
> > > Can you test with my branch:
> > > https://github.com/amir73il/linux/commits/ovl_casefold/
> > >
> >
> > Right, our branches have a different base, mine is older and based on
> > the tag vfs/vfs-6.18.mount.
> >
> > I have now tested with your branch, and indeed the test fails with
> > "overlayfs does not support casefold enabled". I did some debugging and
> > the missing commit from my branch that is making this difference here is
> > e8bd877fb76bb9f3 ("ovl: fix possible double unlink"). After reverting it
> > on top of your branch, the test works. I'm not sure yet why this
> > prevents the mount, but this is the call trace when the error happens:
>
> Wow, that is an interesting development race...
>
> >
> > TID/PID 860/860 (mount/mount):
> >
> > entry_SYSCALL_64_after_hwframe+0x77
> > do_syscall_64+0xa2
> > x64_sys_call+0x1bc3
> > __x64_sys_fsconfig+0x46c
> > vfs_cmd_create+0x60
> > vfs_get_tree+0x2e
> > ovl_get_tree+0x19
> > get_tree_nodev+0x70
> > ovl_fill_super+0x53b
> > ! 0us [-EINVAL] ovl_parent_lock
> >
> > And for the ovl_parent_lock() arguments, *parent="work", *child="#7". So
> > right now I'm trying to figure out why the dentry for #7 is not hashed.
> >
>
> The reason is this:
>
> static struct dentry *ext4_lookup(...
> {
> ...
> if (IS_ENABLED(CONFIG_UNICODE) && !inode && IS_CASEFOLDED(dir)) {
> /* Eventually we want to call d_add_ci(dentry, NULL)
> * for negative dentries in the encoding case as
> * well. For now, prevent the negative dentry
> * from being cached.
> */
> return NULL;
> }
>
> return d_splice_alias(inode, dentry);
> }
>
> Neil,
>
> Apparently, the assumption that
> ovl_lookup_temp() => ovl_lookup_upper() => lookup_one()
> returns a hashed dentry is not always true.
>
> It may be always true for all the filesystems that are currently
> supported as an overlayfs
> upper layer fs (?), but it does not look like you can count on this
> for the wider vfs effort
> and we should try to come up with a solution for ovl_parent_lock()
> that will allow enabling
> casefolding on overlayfs layers.
>
> This patch seems to work. WDYT?
>
> Thanks,
> Amir.
>
> commit 5dfcd10378038637648f3f422e3d5097eb6faa5f
> Author: Amir Goldstein <amir73il@...il.com>
> Date: Wed Aug 27 19:55:26 2025 +0200
>
> ovl: adapt ovl_parent_lock() to casefolded directories
>
> e8bd877fb76bb9f3 ("ovl: fix possible double unlink") added a sanity
> check of !d_unhashed(child) to try to verify that child dentry was not
> unlinked while parent dir was unlocked.
>
> This "was not unlink" check has a false positive result in the case of
> casefolded parent dir, because in that case, ovl_create_temp() returns
> an unhashed dentry.
>
> Change the "was not unlinked" check to use cant_mount(child).
> cant_mount(child) means that child was unlinked while we have been
> holding a reference to child, so it could not have become negative.
>
> This fixes the error in ovl_parent_lock() in ovl_check_rename_whiteout()
> after ovl_create_temp() and allows mount of overlayfs with casefolding
> enabled layers.
>
> Reported-by: André Almeida <andrealmeid@...lia.com>
> Link: https://lore.kernel.org/r/18704e8c-c734-43f3-bc7c-b8be345e1bf5@igalia.com/
> Signed-off-by: Amir Goldstein <amir73il@...il.com>
>
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index bec4a39d1b97c..bffbb59776720 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -1551,9 +1551,23 @@ void ovl_copyattr(struct inode *inode)
>
> int ovl_parent_lock(struct dentry *parent, struct dentry *child)
> {
> + bool is_unlinked;
> +
> inode_lock_nested(parent->d_inode, I_MUTEX_PARENT);
> - if (!child ||
> - (!d_unhashed(child) && child->d_parent == parent))
> + if (!child)
> + return 0;
> +
> + /*
> + * After re-acquiring parent dir lock, verify that child was not moved
> + * to another parent and that it was not unlinked. cant_mount() means
> + * that child was unlinked while parent was unlocked. Since we are
> + * holding a reference to child, it could not have become negative.
> + * d_unhashed(child) is not a strong enough indication for unlinked,
> + * because with casefolded parent dir, ovl_create_temp() returns an
> + * unhashed dentry.
> + */
> + is_unlinked = cant_mount(child) || WARN_ON_ONCE(d_is_negative(child));
> + if (!is_unlinked && child->d_parent == parent)
> return 0;
>
> inode_unlock(parent->d_inode);
>
I don't feel comfortable with that. Letting ovl_parent_lock() succeed
on an unhashed dentry doesn't work for my longer term plans for locking.
I would really rather we got that dentry hashed.
What is happening is :
- lookup on non-existent name -> unhashed dentry
- vfs_create on that dentry - still unhashed
- rename of that unhashed dentry -> confusion in ovl_parent_lock()
If this were being done from user-space there would be another lookup
after the create and before the rename, and that would result in a
hashed dentry.
Could ovl_create_real() do a lookup for the name if the dentry isn't
hashed? That should result in a dentry that can safely be passed to
ovl_parent_lock().
NeilBrown
Powered by blists - more mailing lists