linux-kernel - Re: [PATCH v9 1/4] unicode: Add utf8_casefold

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87h7v1gi6q.fsf@collabora.com>
Date:   Wed, 24 Jun 2020 01:13:17 -0400
From:   Gabriel Krisman Bertazi <krisman@...labora.com>
To:     Daniel Rosenberg <drosen@...gle.com>
Cc:     "Theodore Ts'o" <tytso@....edu>, linux-ext4@...r.kernel.org,
        Jaegeuk Kim <jaegeuk@...nel.org>, Chao Yu <chao@...nel.org>,
        linux-f2fs-devel@...ts.sourceforge.net,
        Eric Biggers <ebiggers@...nel.org>,
        linux-fscrypt@...r.kernel.org,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Richard Weinberger <richard@....at>,
        linux-mtd@...ts.infradead.org,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Jonathan Corbet <corbet@....net>, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        kernel-team@...roid.com
Subject: Re: [PATCH v9 1/4] unicode: Add utf8_casefold_hash

Daniel Rosenberg <drosen@...gle.com> writes:

> This adds a case insensitive hash function to allow taking the hash
> without needing to allocate a casefolded copy of the string.
>
> Signed-off-by: Daniel Rosenberg <drosen@...gle.com>
> ---
>  fs/unicode/utf8-core.c  | 23 ++++++++++++++++++++++-
>  include/linux/unicode.h |  3 +++
>  2 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
> index 2a878b739115d..90656b9980720 100644
> --- a/fs/unicode/utf8-core.c
> +++ b/fs/unicode/utf8-core.c
> @@ -6,6 +6,7 @@
>  #include <linux/parser.h>
>  #include <linux/errno.h>
>  #include <linux/unicode.h>
> +#include <linux/stringhash.h>
>  
>  #include "utf8n.h"
>  
> @@ -122,9 +123,29 @@ int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
>  	}
>  	return -EINVAL;
>  }
> -
>  EXPORT_SYMBOL(utf8_casefold);
>  
> +int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
> +		       struct qstr *str)
> +{
> +	const struct utf8data *data = utf8nfdicf(um->version);
> +	struct utf8cursor cur;
> +	int c;
> +	unsigned long hash = init_name_hash(salt);
> +
> +	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
> +		return -EINVAL;
> +
> +	while ((c = utf8byte(&cur))) {
> +		if (c < 0)
> +			return c;

Return -EINVAL here to match other unicode functions, since utf8byte
will return -1 on a binary blob, which doesn't make sense for this.

Other than that, looks good to me.

Reviewed-by: Gabriel Krisman Bertazi <krisman@...labora.com>

> +		hash = partial_name_hash((unsigned char)c, hash);
> +	}
> +	str->hash = end_name_hash(hash);
> +	return 0;
> +}
> +EXPORT_SYMBOL(utf8_casefold_hash);
> +
>  int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
>  		   unsigned char *dest, size_t dlen)
>  {
> diff --git a/include/linux/unicode.h b/include/linux/unicode.h
> index 990aa97d80496..74484d44c7554 100644
> --- a/include/linux/unicode.h
> +++ b/include/linux/unicode.h
> @@ -27,6 +27,9 @@ int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
>  int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
>  		  unsigned char *dest, size_t dlen);
>  
> +int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
> +		       struct qstr *str);
> +
>  struct unicode_map *utf8_load(const char *version);
>  void utf8_unload(struct unicode_map *um);

-- 
Gabriel Krisman Bertazi