linux-kernel - Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <52c52665-35f5-4111-a9d4-f8669c79bf70@sdfg.com.ar>
Date: Tue, 13 Feb 2024 00:16:06 +0100
From: Rodrigo Campos <rodrigo@...g.com.ar>
To: Willy Tarreau <w@....eu>
Cc: Thomas Weißschuh <linux@...ssschuh.net>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size
 usage

On 2/11/24 11:48, Willy Tarreau wrote:
> Hi Rodrigo,
> 
> first, thanks for the series!

Thank you, for your time and review! :)

> On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote:
>> The return code should always be strlen(src) + strlen(dst), but dst is
>> considered shorter if size is less than strlen(dst).
>>
>> While we are there, make sure to copy at most size-1 bytes and
>> null-terminate the dst buffer.
>>
>> Signed-off-by: Rodrigo Campos <rodrigo@...g.com.ar>
>> ---
>>   tools/include/nolibc/string.h | 14 +++++++-------
>>   1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
>> index ed15c22b1b2a..b2149e1342a8 100644
>> --- a/tools/include/nolibc/string.h
>> +++ b/tools/include/nolibc/string.h
>> @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen)
>>   static __attribute__((unused))
>>   size_t strlcat(char *dst, const char *src, size_t size)
>>   {
>> -	size_t len;
>>   	char c;
>> +	size_t len = strlen(dst);
>> +	size_t ret = strlen(src) + (size < len? size: len);
> 
>  From what I'm reading in the man page, ret should *always* be the sum
> of the two string lengths. I guess it helps for reallocation. It's even
> explicitly mentioned:
> 
>    "While this may seem somewhat confusing, it was done to make truncation
>     detection simple."

Yes, that was my *first* understanding of the manpage too. But it 
doesn't seem to be the correct interpretation.

Also, this is exactly what I tried to say in the cover letter, with:

	I thought the manpage was clear, but when checking against that,
	I noted a few twists (like the manpage says the return code of
	strlcat is strlen(src) + strlen(dst), but it was not clear it is
	not that if size < strlen(dst). When looking at the libbsd
	implementation and re-reading the manpage, I understood what it
	really meant).

Sorry if I wasn't clear. Let me try again.

My first interpretation of the manpage was also that, and I think it 
would make sense to be that one. However, it is wrong, they also say 
this, that is what made me add the ternary operator:

	Note, however, that if strlcat() traverses size characters
	without finding a NUL, the length of the string is considered
	to be  size and the destination string will not be NUL
	terminated (since there was no space for the NUL)

So, my interpretation is that it is the sum of both, except when we 
can't find the NUL in that size, in which case we conside "size" to be 
the len of dst.

If you compare it with the output of libbsd, the return code seems to be 
exactly that. I was surprised too, as the manpage seem so clear... :-/

> Above ret will be bound to the existing size so a realloc wouldn't work.
> Thus I think the correct solution is:

Note that this implementation fails the tests added on patch 4. I've 
tested them (output and return code) to match the libbsd implementation.

> The test inside the loop is going to make this not very efficient. Same
> for the fact that we're measuring the length of src twice (once via
> strlen, a second time through the loop). I've just had a look and it
> compiles to 77 bytes at -Os. A simpler variant would consist in trying

How are you measuring the size?

I've added noinline to strlcat to the version I sent, so now it is shown 
in nm, but I have this as output:

$ nm --size -t x test.o
	0000000000000004 V errno
	0000000000000006 t strlcat.constprop.0
	0000000000000008 V _auxv
	0000000000000008 V environ
	000000000000000e W strlen
	000000000000000f W _start
	0000000000000018 W raise
	000000000000001b W abort
	000000000000004c T main
	000000000000005a t u64toa_r.isra.0
	0000000000000095 W _start_c
	00000000000002a8 t printf

How are you measuring it there?

Sorry, I'm not familiar with this :)

> to copy what fits in <size> and once reached, go on just for trailing
> zero and the size measurement:
> 
> size_t strlcat(char *dst, const char *src, size_t size)
> {
>          size_t len = strlen(dst);

The thing is, we need to return always at least strlen(src). Maybe plus 
something else. Even if size==0 and we don't copy anything.

Maybe we can special case that, so we simplify the loop, and it's smaller?

I've been playing, but as I can't measure the size, I'm not sure what is 
really better. I'd like to play a little once I know how to measure it :)

Best,
Rodrigo