[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9aeb7da4-0b94-4b53-9573-9f7fdcf142cc@igalia.com>
Date: Tue, 26 Aug 2025 17:01:06 -0300
From: André Almeida <andrealmeid@...lia.com>
To: Amir Goldstein <amir73il@...il.com>,
Gabriel Krisman Bertazi <krisman@...e.de>
Cc: Miklos Szeredi <miklos@...redi.hu>, Theodore Tso <tytso@....edu>,
linux-unionfs@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
kernel-dev@...lia.com
Subject: Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded
strncmp()
Em 26/08/2025 04:19, Amir Goldstein escreveu:
> On Tue, Aug 26, 2025 at 3:34 AM Gabriel Krisman Bertazi <krisman@...e.de> wrote:
>>
>> Gabriel Krisman Bertazi <gabriel@...sman.be> writes:
>>
>>> Amir Goldstein <amir73il@...il.com> writes:
>>>
>>>> On Mon, Aug 25, 2025 at 5:27 PM Amir Goldstein <amir73il@...il.com> wrote:
>>>>>
>>>>> On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
>>>>> <gabriel@...sman.be> wrote:
>>>>>>
>>>>>> André Almeida <andrealmeid@...lia.com> writes:
>>>>>>
>>>>>>> To add overlayfs support casefold layers, create a new function
>>>>>>> ovl_casefold(), to be able to do case-insensitive strncmp().
>>>>>>>
>>>>>>> ovl_casefold() allocates a new buffer and stores the casefolded version
>>>>>>> of the string on it. If the allocation or the casefold operation fails,
>>>>>>> fallback to use the original string.
>>>>>>>
>>>>>>> The case-insentive name is then used in the rb-tree search/insertion
>>>>>>> operation. If the name is found in the rb-tree, the name can be
>>>>>>> discarded and the buffer is freed. If the name isn't found, it's then
>>>>>>> stored at struct ovl_cache_entry to be used later.
>>>>>>>
>>>>>>> Reviewed-by: Amir Goldstein <amir73il@...il.com>
>>>>>>> Signed-off-by: André Almeida <andrealmeid@...lia.com>
>>>>>>> ---
>>>>>>> Changes from v6:
>>>>>>> - Last version was using `strncmp(... tmp->len)` which was causing
>>>>>>> regressions. It should be `strncmp(... len)`.
>>>>>>> - Rename cf_len to c_len
>>>>>>> - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
>>>>>>> - Remove needless kfree(cf_name)
>>>>>>> ---
>>>>>>> fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
>>>>>>> 1 file changed, 94 insertions(+), 19 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
>>>>>>> index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
>>>>>>> --- a/fs/overlayfs/readdir.c
>>>>>>> +++ b/fs/overlayfs/readdir.c
>>>>>>> @@ -27,6 +27,8 @@ struct ovl_cache_entry {
>>>>>>> bool is_upper;
>>>>>>> bool is_whiteout;
>>>>>>> bool check_xwhiteout;
>>>>>>> + const char *c_name;
>>>>>>> + int c_len;
>>>>>>> char name[];
>>>>>>> };
>>>>>>>
>>>>>>> @@ -45,6 +47,7 @@ struct ovl_readdir_data {
>>>>>>> struct list_head *list;
>>>>>>> struct list_head middle;
>>>>>>> struct ovl_cache_entry *first_maybe_whiteout;
>>>>>>> + struct unicode_map *map;
>>>>>>> int count;
>>>>>>> int err;
>>>>>>> bool is_upper;
>>>>>>> @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
>>>>>>> return rb_entry(n, struct ovl_cache_entry, node);
>>>>>>> }
>>>>>>>
>>>>>>> +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
>>>>>>> +{
>>>>>>> + const struct qstr qstr = { .name = str, .len = len };
>>>>>>> + int cf_len;
>>>>>>> +
>>>>>>> + if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
>>>>>>> + return 0;
>>>>>>> +
>>>>>>> + *dst = kmalloc(NAME_MAX, GFP_KERNEL);
>>>>>>> +
>>>>>>> + if (dst) {
>
> Andre,
>
> Just noticed this is a bug, should have been if (*dst), but anyway following
> Gabriel's comments I have made this change in my tree (pending more
> strict related changes):
>
> static int ovl_casefold(struct ovl_readdir_data *rdd, const char *str, int len,
> char **dst)
> {
> const struct qstr qstr = { .name = str, .len = len };
> char *cf_name;
> int cf_len;
>
> if (!IS_ENABLED(CONFIG_UNICODE) || !rdd->map || is_dot_dotdot(str, len))
> return 0;
>
> cf_name = kmalloc(NAME_MAX, GFP_KERNEL);
> if (!cf_name) {
> rdd->err = -ENOMEM;
> return -ENOMEM;
> }
>
> cf_len = utf8_casefold(rdd->map, &qstr, *dst, NAME_MAX);
> if (cf_len > 0)
> *dst = cf_name;
> else
> kfree(cf_name);
>
> return cf_len;
> }
Right, that makes sense to me. I was unsure what to do regarding
allocation fails, but this seems the right direction. Thanks!
>
>>>>>>> + cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
>>>>>>> +
>>>>>>> + if (cf_len > 0)
>>>>>>> + return cf_len;
>>>>>>> + }
>>>>>>> +
>>>>>>> + kfree(*dst);
>>>>>>> + return 0;
>>>>>>> +}
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I should just note this does not differentiates allocation errors from
>>>>>> casefolding errors (invalid encoding). It might be just a theoretical
>>>>>> error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
>>>>>> operation is likely to fail too, but if you have an allocation failure, you
>>>>>> can end up with an inconsistent cache, because a file is added under the
>>>>>> !casefolded name and a later successful lookup will look for the
>>>>>> casefolded version.
>>>>>
>>>>> Good point.
>>>>> I will fix this in my tree.
>>>>
>>>> wait why should we not fail to fill the cache for both allocation
>>>> and encoding errors?
>>>>
>>>
>>> We shouldn't fail the cache for encoding errors, just for allocation errors.
>>>
>>> Perhaps I am misreading the code, so please correct me if I'm wrong. if
>>> ovl_casefold fails, the non-casefolded name is used in the cache. That
>>> makes sense if the reason utf8_casefold failed is because the string
>>> cannot be casefolded (i.e. an invalid utf-8 string). For those strings,
>>> everything is fine. But on an allocation failure, the string might have
>>> a real casefolded version. If we fallback to the original string as the
>>> key, a cache lookup won't find the entry, since we compare with memcmp.
>
> Just to make it clear in case the name "cache lookup" confuses anyone
> on this thread - we are talking about ovl readdir cache, not about the vfs
> lookup cache, the the purpose of ovl readdir cache is twofold:
> 1. plain in-memory readdir cache
> 2. (more important to this discussion) implementation of "merged dir" content
>
> So I agree with you that with non-strict mode, invalid encoded names
> should be added to readdir cache as is and not in the case of allocation
> failure.
>
>>
>> I was thinking again about this and I suspect I misunderstood your
>> question. let me try to answer it again:
>>
>> Ext4, f2fs and tmpfs all allow invalid utf8-encoded strings in a
>> casefolded directory when running on non-strict-mode. They are treated
>> as non-encoded byte-sequences, as if they were seen on a case-Sensitive
>> directory. They can't collide with other filenames because they
>> basically "fold" to themselves.
>>
>> Now I suspect there is another problem with this series: I don't see how
>> it implements the semantics of strict mode. What happens if upper and
>> lower are in strict mode (which is valid, same encoding_flags) but there
>> is an invalid name in the lower? overlayfs should reject the dentry,
>> because any attempt to create it to the upper will fail.
>
> Ok, so IIUC, one issue is that return value from ovl_casefold() should be
> conditional to the sb encoding_flags, which was inherited from the layers.
>
> Again, *IF* I understand correctly, then strict mode ext4 will not allow
> creating an invalid-encoded name, but will strict mode ext4 allow
> it as a valid lookup result?
>
>>
>> André, did you consider this scenario?
>
> In general, as I have told Andre from v1, please stick to the most common
> configs that people actually need.
>
> We do NOT need to support every possible combination of layers configurations.
>
> This is why we went with supporting all-or-nothing configs for casefolder dirs.
> Because it is simpler for overlayfs semantics and good enough for what
> users need.
>
> So my question is to you both: do users actually use strict mode for
> wine and such?
> Because if they don't I would rather support the default mode only
> (enforced on mount)
> and add support for strict mode later per actual users demand.
>
I agree with Gabriel, no need to add this for Wine. We can refuse to
mount to make things easier.
>> You can test by creating a file
>> with an invalid-encoded name in a casefolded directory of a
>> non-strict-mode filesystem and then flip the strict-mode flag in the
>> superblock. I can give it a try tomorrow too.
>
> Can the sb flags be flipped in runtime? while mounted?
> I suppose you are talking about an offline change that requires
> re-mount of overlayfs and re-validate the same encoding flags on all layers?
>
> Andre,
>
> Please also add these and other casefold functional tests to fstest to
> validate correctness of the merge dir implementation with different
> casefold variants in different layers.
>
Ok, I will add a test case to stress mounting layers with different
encoding versions, flags and etc.
Powered by blists - more mailing lists