linux-kernel - [PATCH] string: Improve the generic strlcpy() implementation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 5 Oct 2015 13:27:00 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Chris Metcalf <cmetcalf@...hip.com>,
	open list <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, Borislav Petkov <bp@...en8.de>
Subject: [PATCH] string: Improve the generic strlcpy() implementation


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Thu, Sep 10, 2015 at 8:43 PM, Chris Metcalf <cmetcalf@...hip.com> wrote:
> >
> > Please pull the following changes for 4.3 from:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git strscpy
> 
> So I finally pulled it. I like the patch, I like the new interface,
> but despite that I wasn't really sure if I wanted to pull it in - thus
> the long delay of me just seeing this in my list of pending pulls for
> almost a month, but never really getting to the point where I decided
> I want to commit to it.

Interesting. I noticed that strscpy() says this in its comments:

 * In addition, the implementation is robust to the string changing out
 * from underneath it, unlike the current strlcpy() implementation.

The strscpy() interface is very nice, but shouldn't we also fix this strlcpy() 
unrobustness/race it refers to, in light of the 2000+ existing strlcpy() call 
sites?

The comment does not spell out what the exact race is, but I can see only a single 
race in the current generic strlcpy() implementation, which all architectures 
except s390 uses:

size_t strlcpy(char *dest, const char *src, size_t size)
{
        size_t ret = strlen(src);

        if (size) {
                size_t len = (ret >= size) ? size - 1 : ret;
                memcpy(dest, src, len);
                dest[len] = '\0';
        }
        return ret;
}

If another CPU or an interrupt changes the source string after the strlen(), but 
before the copy is complete, and shortens the source string, then we copy over the 
NUL byte of the source buffer - including fragments of earlier source string 
tails. The target buffer will still be properly NUL terminated - but it will be a 
shorter string than the returned 'ret' source buffer length. (despite there not 
being truncation.)

The s390 arch implementation has the same race AFAICS.

This may cause bugs if the return code is subsequently used to assume that it is 
equal to the destination string's length. (While in reality it's shorter.)

The race is not automatically lethal, because it's guaranteed that the returned 
length is indeed zero-delimited (due to the overlong copy we did) - so if the 
string is memcpy()-ed, then it will still result in a weirdly padded but valid 
string.

But if any subsequent use of the return code relies on the return code being equal 
to a subsequent call of strlen(dest), then that use might lead to bugs. I.e. our 
implementation of strlcpy() is indeed racy and unrobust.

But we could fix this race: by iterating over the string in a single go and 
determining the length and copying the string at once. Like the new strscpy() code 
does it, but with strlcpy() semantics.

This will also make strlcpy() faster, FWIMBW.

I also noticed another problem with our current generic strlcpy() implementation: 
AFAICS it will also happily do bad stuff if we pass it a negative size. Instead of 
that we should print a warning and return safely.

I've implemented all that, see the patch below.

(Only very lightly tested so far and no benchmarking done.)

Thanks,

	Ingo

===================================>
>From 53fc46c16ed65e67906d5b453e19d8f688093f70 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@...nel.org>
Date: Mon, 5 Oct 2015 10:56:50 +0200
Subject: [PATCH] string: Improve the generic strlcpy() implementation

The current strlcpy() implementation has two implementational
weaknesses:

1)

There's a race:

size_t strlcpy(char *dest, const char *src, size_t size)
{
        size_t ret = strlen(src);

        if (size) {
                size_t len = (ret >= size) ? size - 1 : ret;
                memcpy(dest, src, len);
                dest[len] = '\0';
        }
        return ret;
}

If another CPU or an interrupt changes the source string after the strlen(), but
before the copy is complete, and shortens the source string, then we copy over the
NUL byte of the source buffer - including fragments of earlier source string
tails. The target buffer will still be properly NUL terminated - but it will be a
shorter string than the returned 'ret' source buffer length. (despite there not
being truncation.)

The s390 arch implementation has the same race AFAICS.

This may cause bugs if the return code is subsequently used to assume that it is
equal to the destination string's length. (While in reality it's shorter.)

The race is not automatically lethal, because it's guaranteed that the returned
length is indeed zero-delimited (due to the overlong copy we did) - so if the
string is memcpy()-ed, then it will still result in a weirdly padded but valid
string.

But if any subsequent use of the return code relies on the return code being equal
to a subsequent call of strlen(dest), then that use might lead to bugs. I.e. our
implementation of strlcpy() is indeed racy and unrobust.

But we can fix this race: by iterating over the string in a single go and
determining the length and copying the string at once. Like strscpy(), but with
strlcpy() semantics.

The new implementation uses word-by-word iteration over the strings if possible,
so this will also make strlcpy() faster as well.

2)

Another problem is that strlcpy() will also happily do bad stuff if we pass
it a negative size. Instead of that we will from now on print a (one time)
warning and return safely.

Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
 lib/string.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 78 insertions(+), 8 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index 8dbb7b1eab50..e0cfca299606 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -129,23 +129,93 @@ EXPORT_SYMBOL(strncpy);
  * strlcpy - Copy a C-string into a sized buffer
  * @dest: Where to copy the string to
  * @src: Where to copy the string from
- * @size: size of destination buffer
+ * @dest_size: size of destination buffer
  *
  * Compatible with *BSD: the result is always a valid
  * NUL-terminated string that fits in the buffer (unless,
  * of course, the buffer size is zero). It does not pad
  * out the result like strncpy() does.
  */
-size_t strlcpy(char *dest, const char *src, size_t size)
+size_t strlcpy(char *dest, const char *src, size_t dest_size)
 {
-	size_t ret = strlen(src);
+	const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
+	size_t dest_left = dest_size;
+	size_t dest_aligned_left = dest_left;
+	long src_len = 0;
+
+	/* Overflow check: */
+	if (unlikely(dest_size < 0)) {
+		WARN_ONCE(1, "strlcpy(): dest_size < 0 underflow!");
+		return strlen(src);
+	}
+
+#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+	/*
+	 * If src is unaligned, don't cross a page boundary,
+	 * since we don't know if the next page is mapped.
+	 */
+	if ((long)src & (sizeof(long) - 1)) {
+		size_t limit = PAGE_SIZE - ((long)src & (PAGE_SIZE - 1));
+		if (limit < dest_aligned_left)
+			dest_aligned_left = limit;
+	}
+#else
+	/* If src or dest is unaligned, don't do word-at-a-time. */
+	if (((long) dest | (long) src) & (sizeof(long) - 1))
+		dest_aligned_left = 0;
+#endif
+
+	/* First do the word-at-a-time copy of the aligned portion (if any): */
+	while (dest_aligned_left >= sizeof(unsigned long)) {
+		unsigned long c, data;
 
-	if (size) {
-		size_t len = (ret >= size) ? size - 1 : ret;
-		memcpy(dest, src, len);
-		dest[len] = '\0';
+		c = *(unsigned long *)(src+src_len);
+		*(unsigned long *)(dest+src_len) = c;
+
+		if (has_zero(c, &data, &constants)) {
+			data = prep_zero_mask(c, data, &constants);
+			data = create_zero_mask(data);
+			/* The target string was terminated by the above word copy */
+			return src_len + find_zero(data);
+		}
+		src_len += sizeof(unsigned long);
+		dest_left -= sizeof(unsigned long);
+		dest_aligned_left -= sizeof(unsigned long);
 	}
-	return ret;
+
+	/*
+	 * We get here either for tails smaller than word size, or
+	 * unaligned strings. Copy byte by byte and return the
+	 * length of the source string if we find its end:
+	 */
+	while (dest_left) {
+		char c;
+
+		c = src[src_len];
+		dest[src_len] = c;
+		if (!c)
+			/* The target string was terminated by the above byte copy */
+			return src_len;
+		src_len++;
+		dest_left--;
+	}
+
+	/*
+	 * We get here if the source string is larger than the destination buffer.
+	 *
+	 * The strlcpy() semantics require us to return the length of the
+	 * source string - so we have to continue until we find its end.
+	 *
+	 * We first zero-terminate the (truncated, hence non yet terminated)
+	 * target string.
+	 */
+	if (dest_size)
+		dest[dest_size-1] = '\0';
+
+	while (src[src_len])
+		src_len++;
+
+	return src_len;
 }
 EXPORT_SYMBOL(strlcpy);
 #endif
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/