lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87plci3lxw.fsf@mailhost.krisman.be>
Date: Mon, 25 Aug 2025 21:34:35 -0400
From: Gabriel Krisman Bertazi <krisman@...e.de>
To: Amir Goldstein <amir73il@...il.com>
Cc: André Almeida <andrealmeid@...lia.com>,  Miklos Szeredi
 <miklos@...redi.hu>,  Theodore Tso <tytso@....edu>,
  linux-unionfs@...r.kernel.org,  linux-kernel@...r.kernel.org,
  linux-fsdevel@...r.kernel.org,  Alexander Viro <viro@...iv.linux.org.uk>,
  Christian Brauner <brauner@...nel.org>,  Jan Kara <jack@...e.cz>,
  kernel-dev@...lia.com
Subject: Re: [PATCH v6 4/9] ovl: Create ovl_casefold() to support casefolded
 strncmp()

Gabriel Krisman Bertazi <gabriel@...sman.be> writes:

> Amir Goldstein <amir73il@...il.com> writes:
>
>> On Mon, Aug 25, 2025 at 5:27 PM Amir Goldstein <amir73il@...il.com> wrote:
>>>
>>> On Mon, Aug 25, 2025 at 1:09 PM Gabriel Krisman Bertazi
>>> <gabriel@...sman.be> wrote:
>>> >
>>> > André Almeida <andrealmeid@...lia.com> writes:
>>> >
>>> > > To add overlayfs support casefold layers, create a new function
>>> > > ovl_casefold(), to be able to do case-insensitive strncmp().
>>> > >
>>> > > ovl_casefold() allocates a new buffer and stores the casefolded version
>>> > > of the string on it. If the allocation or the casefold operation fails,
>>> > > fallback to use the original string.
>>> > >
>>> > > The case-insentive name is then used in the rb-tree search/insertion
>>> > > operation. If the name is found in the rb-tree, the name can be
>>> > > discarded and the buffer is freed. If the name isn't found, it's then
>>> > > stored at struct ovl_cache_entry to be used later.
>>> > >
>>> > > Reviewed-by: Amir Goldstein <amir73il@...il.com>
>>> > > Signed-off-by: André Almeida <andrealmeid@...lia.com>
>>> > > ---
>>> > > Changes from v6:
>>> > >  - Last version was using `strncmp(... tmp->len)` which was causing
>>> > >    regressions. It should be `strncmp(... len)`.
>>> > >  - Rename cf_len to c_len
>>> > >  - Use c_len for tree operation: (cmp < 0 || len < tmp->c_len)
>>> > >  - Remove needless kfree(cf_name)
>>> > > ---
>>> > >  fs/overlayfs/readdir.c | 113 ++++++++++++++++++++++++++++++++++++++++---------
>>> > >  1 file changed, 94 insertions(+), 19 deletions(-)
>>> > >
>>> > > diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
>>> > > index b65cdfce31ce27172d28d879559f1008b9c87320..dfc661b7bc3f87efbf14991e97cee169400d823b 100644
>>> > > --- a/fs/overlayfs/readdir.c
>>> > > +++ b/fs/overlayfs/readdir.c
>>> > > @@ -27,6 +27,8 @@ struct ovl_cache_entry {
>>> > >       bool is_upper;
>>> > >       bool is_whiteout;
>>> > >       bool check_xwhiteout;
>>> > > +     const char *c_name;
>>> > > +     int c_len;
>>> > >       char name[];
>>> > >  };
>>> > >
>>> > > @@ -45,6 +47,7 @@ struct ovl_readdir_data {
>>> > >       struct list_head *list;
>>> > >       struct list_head middle;
>>> > >       struct ovl_cache_entry *first_maybe_whiteout;
>>> > > +     struct unicode_map *map;
>>> > >       int count;
>>> > >       int err;
>>> > >       bool is_upper;
>>> > > @@ -66,6 +69,27 @@ static struct ovl_cache_entry *ovl_cache_entry_from_node(struct rb_node *n)
>>> > >       return rb_entry(n, struct ovl_cache_entry, node);
>>> > >  }
>>> > >
>>> > > +static int ovl_casefold(struct unicode_map *map, const char *str, int len, char **dst)
>>> > > +{
>>> > > +     const struct qstr qstr = { .name = str, .len = len };
>>> > > +     int cf_len;
>>> > > +
>>> > > +     if (!IS_ENABLED(CONFIG_UNICODE) || !map || is_dot_dotdot(str, len))
>>> > > +             return 0;
>>> > > +
>>> > > +     *dst = kmalloc(NAME_MAX, GFP_KERNEL);
>>> > > +
>>> > > +     if (dst) {
>>> > > +             cf_len = utf8_casefold(map, &qstr, *dst, NAME_MAX);
>>> > > +
>>> > > +             if (cf_len > 0)
>>> > > +                     return cf_len;
>>> > > +     }
>>> > > +
>>> > > +     kfree(*dst);
>>> > > +     return 0;
>>> > > +}
>>> >
>>> > Hi,
>>> >
>>> > I should just note this does not differentiates allocation errors from
>>> > casefolding errors (invalid encoding).  It might be just a theoretical
>>> > error because GFP_KERNEL shouldn't fail (wink, wink) and the rest of the
>>> > operation is likely to fail too, but if you have an allocation failure, you
>>> > can end up with an inconsistent cache, because a file is added under the
>>> > !casefolded name and a later successful lookup will look for the
>>> > casefolded version.
>>>
>>> Good point.
>>> I will fix this in my tree.
>>
>> wait why should we not fail to fill the cache for both allocation
>> and encoding errors?
>>
>
> We shouldn't fail the cache for encoding errors, just for allocation errors.
>
> Perhaps I am misreading the code, so please correct me if I'm wrong.  if
> ovl_casefold fails, the non-casefolded name is used in the cache.  That
> makes sense if the reason utf8_casefold failed is because the string
> cannot be casefolded (i.e. an invalid utf-8 string). For those strings,
> everything is fine.  But on an allocation failure, the string might have
> a real casefolded version.  If we fallback to the original string as the
> key, a cache lookup won't find the entry, since we compare with memcmp.

I was thinking again about this and I suspect I misunderstood your
question.  let me try to answer it again:

Ext4, f2fs and tmpfs all allow invalid utf8-encoded strings in a
casefolded directory when running on non-strict-mode.  They are treated
as non-encoded byte-sequences, as if they were seen on a case-Sensitive
directory.  They can't collide with other filenames because they
basically "fold" to themselves.

Now I suspect there is another problem with this series: I don't see how
it implements the semantics of strict mode.  What happens if upper and
lower are in strict mode (which is valid, same encoding_flags) but there
is an invalid name in the lower?  overlayfs should reject the dentry,
because any attempt to create it to the upper will fail.

André, did you consider this scenario?  You can test by creating a file
with an invalid-encoded name in a casefolded directory of a
non-strict-mode filesystem and then flip the strict-mode flag in the
superblock.  I can give it a try tomorrow too.

Thanks,

-- 
Gabriel Krisman Bertazi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ