[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOQ4uxjv199LB4XhgeSbTc9VkPB16S86vwcz9tq4GHVX4eVx-w@mail.gmail.com>
Date: Fri, 11 Jul 2025 11:46:30 +0200
From: Amir Goldstein <amir73il@...il.com>
To: André Almeida <andrealmeid@...lia.com>,
Gabriel Krisman Bertazi <krisman@...nel.org>
Cc: Miklos Szeredi <miklos@...redi.hu>, Theodore Tso <tytso@....edu>, linux-unionfs@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
kernel-dev@...lia.com
Subject: Re: [PATCH 1/3] ovl: Make ovl_cache_entry_find support casefold
On Wed, Apr 9, 2025 at 5:01 PM André Almeida <andrealmeid@...lia.com> wrote:
>
> To add overlayfs support casefold filesystems, make
> ovl_cache_entry_find() support casefold dentries.
>
> For the casefold support, just comparing the strings does not work
> because we need the dentry enconding, so make this function find the
> equivalent dentry for a giving directory, if any.
>
> Also, if two strings are not equal, strncmp() return value sign can be
> either positive or negative and this information can be used to optimize
> the walk in the rb tree. utf8_strncmp(), in the other hand, just return
> true or false, so replace the rb walk with a normal rb_next() function.
You cannot just replace a more performance implementation with a
less performant one for everyone else just for your niche use case.
Also it is the wrong approach.
This code needs to use utf8_normalize() to store the normalized
name in the rbtree instead of doing lookup and d_same_name().
and you need to do ovl_cache_entry_add_rb() with the normalized
name anotherwise you break the logic of ovl_dir_read_merged().
Gabriel,
Do you think it makes sense to use utf8_normalize() from this code
directly to generate a key for "is this name found in another layer"
search tree?
I see that utf8_normalize() has zero users, so I guess there was
an intention to use it for things like that?
More nits below, but they will not be relevant once you use the normalized name.
Thanks,
Amir.
>
> Signed-off-by: André Almeida <andrealmeid@...lia.com>
> ---
> fs/overlayfs/ovl_entry.h | 1 +
> fs/overlayfs/readdir.c | 32 +++++++++++++++++++++-----------
> 2 files changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index cb449ab310a7a89aafa0ee04ee7ff6c8141dd7d5..2ee52da85ba3e3fd704415a7ee4e9b7da88bb019 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -90,6 +90,7 @@ struct ovl_fs {
> bool no_shared_whiteout;
> /* r/o snapshot of upperdir sb's only taken on volatile mounts */
> errseq_t errseq;
> + bool casefold;
> };
>
> /* Number of lower layers, not including data-only layers */
> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
> index 881ec5592da52dfb27a588496582e7084b2fbd3b..68f4a83713e9beab604fd23319d60567ef1feeca 100644
> --- a/fs/overlayfs/readdir.c
> +++ b/fs/overlayfs/readdir.c
> @@ -92,21 +92,31 @@ static bool ovl_cache_entry_find_link(const char *name, int len,
> }
>
> static struct ovl_cache_entry *ovl_cache_entry_find(struct rb_root *root,
> - const char *name, int len)
> + const char *name, int len,
> + struct dentry *upper)
This is not an "upper" it is an overlayfs dentry that we call "dentry"
> {
> + struct ovl_fs *ofs = OVL_FS(upper->d_sb);
OVL_FS(upper) is never correct, because OVL_FS() is only applicable to
ovl dentries.
> struct rb_node *node = root->rb_node;
> - int cmp;
> + struct qstr q = { .name = name, .len = len };
>
> while (node) {
> struct ovl_cache_entry *p = ovl_cache_entry_from_node(node);
> + struct dentry *p_dentry, *real_dentry = NULL;
> +
> + if (ofs->casefold && upper) {
> + p_dentry = ovl_lookup_upper(ofs, p->name, upper, p->len);
and here you are mixing a helper to lookup in underlying upper fs with
an overlayfs dentry. You should not do lookup in this context at all.
> + if (!IS_ERR(p_dentry)) {
> + real_dentry = ovl_dentry_real(p_dentry);
> + if (d_same_name(real_dentry, real_dentry->d_parent, &q))
> + return p;
> + }
> + }
>
> - cmp = strncmp(name, p->name, len);
> - if (cmp > 0)
> - node = p->node.rb_right;
> - else if (cmp < 0 || len < p->len)
> - node = p->node.rb_left;
> - else
> - return p;
> + if (!real_dentry)
> + if (!strncmp(name, p->name, len))
> + return p;
> +
> + node = rb_next(&p->node);
As I wrote this change is wrong and unneeded when using normalized names.
Thanks,
Amir.
Powered by blists - more mailing lists