linux-kernel - Re: [PATCH v2 1/2] string: Add load_unaligned_zeropad() code path to sized

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <202503181957.55A0E0A@keescook>
Date: Tue, 18 Mar 2025 20:06:28 -0700
From: Kees Cook <kees@...nel.org>
To: Peter Collingbourne <pcc@...gle.com>
Cc: Alexander Viro <viro@...iv.linux.org.uk>,
	Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andy Shevchenko <andy@...nel.org>,
	Andrey Konovalov <andreyknvl@...il.com>,
	Catalin Marinas <catalin.marinas@....com>,
	Mark Rutland <mark.rutland@....com>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-hardening@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org, stable@...r.kernel.org
Subject: Re: [PATCH v2 1/2] string: Add load_unaligned_zeropad() code path to
 sized_strscpy()

On Tue, Mar 18, 2025 at 02:40:32PM -0700, Peter Collingbourne wrote:
> The call to read_word_at_a_time() in sized_strscpy() is problematic
> with MTE because it may trigger a tag check fault when reading
> across a tag granule (16 bytes) boundary. To make this code
> MTE compatible, let's start using load_unaligned_zeropad()
> on architectures where it is available (i.e. architectures that
> define CONFIG_DCACHE_WORD_ACCESS). Because load_unaligned_zeropad()
> takes care of page boundaries as well as tag granule boundaries,
> also disable the code preventing crossing page boundaries when using
> load_unaligned_zeropad().
> 
> Signed-off-by: Peter Collingbourne <pcc@...gle.com>
> Link: https://linux-review.googlesource.com/id/If4b22e43b5a4ca49726b4bf98ada827fdf755548
> Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS")
> Cc: stable@...r.kernel.org
> ---
> v2:
> - new approach
> 
>  lib/string.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/string.c b/lib/string.c
> index eb4486ed40d25..b632c71df1a50 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -119,6 +119,7 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
>  	if (count == 0 || WARN_ON_ONCE(count > INT_MAX))
>  		return -E2BIG;
>  
> +#ifndef CONFIG_DCACHE_WORD_ACCESS
>  #ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS

I would prefer this were written as:

#if !defined(CONFIG_DCACHE_WORD_ACCESS) && \
    defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)

Having 2 #ifs makes me think there is some reason for having them
separable. But the logic here is for a single check.

>  	/*
>  	 * If src is unaligned, don't cross a page boundary,
> @@ -133,12 +134,14 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
>  	/* If src or dest is unaligned, don't do word-at-a-time. */
>  	if (((long) dest | (long) src) & (sizeof(long) - 1))
>  		max = 0;
> +#endif
>  #endif

(Then no second #endif needed)

>  
>  	/*
> -	 * read_word_at_a_time() below may read uninitialized bytes after the
> -	 * trailing zero and use them in comparisons. Disable this optimization
> -	 * under KMSAN to prevent false positive reports.
> +	 * load_unaligned_zeropad() or read_word_at_a_time() below may read
> +	 * uninitialized bytes after the trailing zero and use them in
> +	 * comparisons. Disable this optimization under KMSAN to prevent
> +	 * false positive reports.
>  	 */
>  	if (IS_ENABLED(CONFIG_KMSAN))
>  		max = 0;
> @@ -146,7 +149,11 @@ ssize_t sized_strscpy(char *dest, const char *src, size_t count)
>  	while (max >= sizeof(unsigned long)) {
>  		unsigned long c, data;
>  
> +#ifdef CONFIG_DCACHE_WORD_ACCESS
> +		c = load_unaligned_zeropad(src+res);
> +#else
>  		c = read_word_at_a_time(src+res);
> +#endif
>  		if (has_zero(c, &data, &constants)) {
>  			data = prep_zero_mask(c, data, &constants);
>  			data = create_zero_mask(data);

The rest seems good. Though I do wonder: what happens on a page boundary
for read_word_at_a_time(), then? We get back zero-filled remainder? Will
that hide a missing NUL terminator? As in, it's not actually there
because of the end of the page/granule, but a zero was put in, so now
it looks like it's been terminated and the exception got eaten? And
doesn't this hide MTE faults since we can't differentiate "overran MTE
tag" from "overran granule while over-reading"?

-- 
Kees Cook