linux-kernel - Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded strncmp()

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9aeb7da4-0b94-4b53-9573-9f7fdcf142cc@igalia.com>
Date: Tue, 26 Aug 2025 17:01:06 -0300
From: André Almeida <andrealmeid@...lia.com>
To: Amir Goldstein <amir73il@...il.com>,
 Gabriel Krisman Bertazi <krisman@...e.de>
Cc: Miklos Szeredi <miklos@...redi.hu>, Theodore Tso <tytso@....edu>,
 linux-unionfs@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-fsdevel@...r.kernel.org, Alexander Viro <viro@...iv.linux.org.uk>,
 Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
 kernel-dev@...lia.com
Subject: Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded
 strncmp()

Em 26/08/2025 04:19, Amir Goldstein escreveu:
> On Tue, Aug 26, 2025 at 3:34 AM Gabriel Krisman Bertazi <krisman@...e.de> wrote:
>>
>> Gabriel Krisman Bertazi <gabriel@...sman.be> writes:
>>
>>> Amir Goldstein <amir73il@...il.com> writes:
>>>
>>>> On Mon, Aug 25, 2025 at 5:27 PM Amir Goldstein <amir73il@...il.com> wrote:
>>>>>
>>>>> On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
>>>>> <gabriel@...sman.be> wrote:
>>>>>>
>>>>>> André Almeida <andrealmeid@...lia.com> writes:
>>>>>>
>>>>>>> To add overlayfs support casefold layers, create a new function
>>>>>>> ovl_casefold(), to be able to do case-insensitive strncmp().
>>>>>>>
>>>>>>> ovl_casefold() allocates a new buffer and stores the casefolded version
>>>>>>> of the string on it. If the allocation or the casefold operation fails,
>>>>>>> fallback to use the original string.
>>>>>>>
>>>>>>> The case-insentive name is then used in the rb-tree search/insertion
>>>>>>> operation. If the name is found in the rb-tree, the name can be
>>>>>>> discarded and the buffer is freed. If the name isn't found, it's then
>>>>>>> stored at struct ovl_cache_entry to be used later.
>>>>>>>
>>>>>>> Reviewed-by: Amir Goldstein <amir73il@...il.com>
>>>>>>> Signed-off-by: André Almeida <andrealmeid@...lia.com>
>>>>>>> ---
>>>>>>> Changes from v6:
>>>>>>>   - Last version was using `strncmp(... tmp->len)` which was causing
>>>>>>>     regressions. It should be `strncmp(... len)`.
>>>>>>>   - Rename cf_len to c_len
>>>>>>>   - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
>>>>>>>   - Remove needless kfree(cf_name)
>>>>>>> ---
>>>>>>>   fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
>>>>>>>   1 file changed, 94 insertions(+), 19 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
>>>>>>> index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
>>>>>>> --- a/fs/overlayfs/readdir.c
>>>>>>> +++ b/fs/overlayfs/readdir.c
>>>>>>> @@ -27,6 +27,8 @@ struct ovl_cache_entry {
>>>>>>>        bool is_upper;
>>>>>>>        bool is_whiteout;
>>>>>>>        bool check_xwhiteout;
>>>>>>> +     const char *c_name;
>>>>>>> +     int c_len;
>>>>>>>        char name[];
>>>>>>>   };
>>>>>>>
>>>>>>> @@ -45,6 +47,7 @@ struct ovl_readdir_data {
>>>>>>>        struct list_head *list;
>>>>>>>        struct list_head middle;
>>>>>>>        struct ovl_cache_entry *first_maybe_whiteout;
>>>>>>> +     struct unicode_map *map;
>>>>>>>        int count;
>>>>>>>        int err;
>>>>>>>        bool is_upper;
>>>>>>> @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
>>>>>>>        return rb_entry(n, struct ovl_cache_entry, node);
>>>>>>>   }
>>>>>>>
>>>>>>> +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
>>>>>>> +{
>>>>>>> +     const struct qstr qstr = { .name = str, .len = len };
>>>>>>> +     int cf_len;
>>>>>>> +
>>>>>>> +     if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
>>>>>>> +             return 0;
>>>>>>> +
>>>>>>> +     *dst = kmalloc(NAME_MAX, GFP_KERNEL);
>>>>>>> +
>>>>>>> +     if (dst) {
> 
> Andre,
> 
> Just noticed this is a bug, should have been if (*dst), but anyway following
> Gabriel's comments I have made this change in my tree (pending more
> strict related changes):
> 
> static int ovl_casefold(struct ovl_readdir_data *rdd, const char *str, int len,
>                          char **dst)
> {
>          const struct qstr qstr = { .name = str, .len = len };
>          char *cf_name;
>          int cf_len;
> 
>          if (!IS_ENABLED(CONFIG_UNICODE) || !rdd->map || is_dot_dotdot(str, len))
>                  return 0;
> 
>          cf_name = kmalloc(NAME_MAX, GFP_KERNEL);
>          if (!cf_name) {
>                  rdd->err = -ENOMEM;
>                  return -ENOMEM;
>          }
> 
>          cf_len = utf8_casefold(rdd->map, &qstr, *dst, NAME_MAX);
>          if (cf_len > 0)
>                  *dst = cf_name;
>          else
>                  kfree(cf_name);
> 
>          return cf_len;
> }

Right, that makes sense to me. I was unsure what to do regarding 
allocation fails, but this seems the right direction. Thanks!

> 
>>>>>>> +             cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
>>>>>>> +
>>>>>>> +             if (cf_len > 0)
>>>>>>> +                     return cf_len;
>>>>>>> +     }
>>>>>>> +
>>>>>>> +     kfree(*dst);
>>>>>>> +     return 0;
>>>>>>> +}
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I should just note this does not differentiates allocation errors from
>>>>>> casefolding errors (invalid encoding).  It might be just a theoretical
>>>>>> error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
>>>>>> operation is likely to fail too, but if you have an allocation failure, you
>>>>>> can end up with an inconsistent cache, because a file is added under the
>>>>>> !casefolded name and a later successful lookup will look for the
>>>>>> casefolded version.
>>>>>
>>>>> Good point.
>>>>> I will fix this in my tree.
>>>>
>>>> wait why should we not fail to fill the cache for both allocation
>>>> and encoding errors?
>>>>
>>>
>>> We shouldn't fail the cache for encoding errors, just for allocation errors.
>>>
>>> Perhaps I am misreading the code, so please correct me if I'm wrong.  if
>>> ovl_casefold fails, the non-casefolded name is used in the cache.  That
>>> makes sense if the reason utf8_casefold failed is because the string
>>> cannot be casefolded (i.e. an invalid utf-8 string). For those strings,
>>> everything is fine.  But on an allocation failure, the string might have
>>> a real casefolded version.  If we fallback to the original string as the
>>> key, a cache lookup won't find the entry, since we compare with memcmp.
> 
> Just to make it clear in case the name "cache lookup" confuses anyone
> on this thread - we are talking about ovl readdir cache, not about the vfs
> lookup cache, the the purpose of ovl readdir cache is twofold:
> 1. plain in-memory readdir cache
> 2. (more important to this discussion) implementation of "merged dir" content
> 
> So I agree with you that with non-strict mode, invalid encoded names
> should be added to readdir cache as is and not in the case of allocation
> failure.
> 
>>
>> I was thinking again about this and I suspect I misunderstood your
>> question.  let me try to answer it again:
>>
>> Ext4, f2fs and tmpfs all allow invalid utf8-encoded strings in a
>> casefolded directory when running on non-strict-mode.  They are treated
>> as non-encoded byte-sequences, as if they were seen on a case-Sensitive
>> directory.  They can't collide with other filenames because they
>> basically "fold" to themselves.
>>
>> Now I suspect there is another problem with this series: I don't see how
>> it implements the semantics of strict mode.  What happens if upper and
>> lower are in strict mode (which is valid, same encoding_flags) but there
>> is an invalid name in the lower?  overlayfs should reject the dentry,
>> because any attempt to create it to the upper will fail.
> 
> Ok, so IIUC, one issue is that return value from ovl_casefold() should be
> conditional to the sb encoding_flags, which was inherited from the layers.
> 
> Again, *IF* I understand correctly, then strict mode ext4 will not allow
> creating an invalid-encoded name, but will strict mode ext4 allow
> it as a valid lookup result?
> 
>>
>> André, did you consider this scenario?
> 
> In general, as I have told Andre from v1, please stick to the most common
> configs that people actually need.
> 
> We do NOT need to support every possible combination of layers configurations.
> 
> This is why we went with supporting all-or-nothing configs for casefolder dirs.
> Because it is simpler for overlayfs semantics and good enough for what
> users need.
> 
> So my question is to you both: do users actually use strict mode for
> wine and such?
> Because if they don't I would rather support the default mode only
> (enforced on mount)
> and add support for strict mode later per actual users demand.
> 

I agree with Gabriel, no need to add this for Wine. We can refuse to 
mount to make things easier.

>> You can test by creating a file
>> with an invalid-encoded name in a casefolded directory of a
>> non-strict-mode filesystem and then flip the strict-mode flag in the
>> superblock.  I can give it a try tomorrow too.
> 
> Can the sb flags be flipped in runtime? while mounted?
> I suppose you are talking about an offline change that requires
> re-mount of overlayfs and re-validate the same encoding flags on all layers?
> 
> Andre,
> 
> Please also add these and other casefold functional tests to fstest to
> validate correctness of the merge dir implementation with different
> casefold variants in different layers.
> 

Ok, I will add a test case to stress mounting layers with different 
encoding versions, flags and etc.