[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1a27385c-cca6-888b-1125-d6383e48c0f5@prevas.dk>
Date: Wed, 21 Dec 2022 16:29:11 +0100
From: Rasmus Villemoes <rasmus.villemoes@...vas.dk>
To: Geert Uytterhoeven <geert@...ux-m68k.org>,
Guenter Roeck <linux@...ck-us.net>
Cc: "Jason A. Donenfeld" <Jason@...c4.com>,
linux-kernel@...r.kernel.org, linux-kbuild@...r.kernel.org,
linux-arch@...r.kernel.org, linux-toolchains@...r.kernel.org,
Masahiro Yamada <masahiroy@...nel.org>,
Kees Cook <keescook@...omium.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-m68k@...ts.linux-m68k.org
Subject: Re: [PATCH v2] kbuild: treat char as always unsigned
On 21/12/2022 16.05, Geert Uytterhoeven wrote:
> Hi Günter,
>
> On Wed, Dec 21, 2022 at 3:54 PM Guenter Roeck <linux@...ck-us.net> wrote:
>> On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
>>> Recently, some compile-time checking I added to the clamp_t family of
>>> functions triggered a build error when a poorly written driver was
>>> compiled on ARM, because the driver assumed that the naked `char` type
>>> is signed, but ARM treats it as unsigned, and the C standard says it's
>>> architecture-dependent.
>>>
>>> I doubt this particular driver is the only instance in which
>>> unsuspecting authors make assumptions about `char` with no `signed` or
>>> `unsigned` specifier. We were lucky enough this time that that driver
>>> used `clamp_t(char, negative_value, positive_value)`, so the new
>>> checking code found it, and I've sent a patch to fix it, but there are
>>> likely other places lurking that won't be so easily unearthed.
>>>
>>> So let's just eliminate this particular variety of heisensign bugs
>>> entirely. Set `-funsigned-char` globally, so that gcc makes the type
>>> unsigned on all architectures.
>>>
>>> This will break things in some places and fix things in others, so this
>>> will likely cause a bit of churn while reconciling the type misuse.
>>>
>>
>> There is an interesting fallout: When running the m68k:q800 qemu emulation,
>> there are lots of warning backtraces.
>>
>> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
>> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha12,aes)' before 'adiantum(xchacha20,aes)'
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
>> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha20,aes)' before 'aegis128'
>>
>> and so on for pretty much every entry in the alg_test_descs[] array.
>>
>> Bisect points to this patch, and reverting it fixes the problem.
>>
>> It looks like the problem is that arch/m68k/include/asm/string.h
>> uses "char res" to store the result of strcmp(), and char is now
>> unsigned - meaning strcmp() will now never return a value < 0.
>> Effectively that means that strcmp() is broken on m68k if
>> CONFIG_COLDFIRE=n.
>>
>> The fix is probably quite simple.
>>
>> diff --git a/arch/m68k/include/asm/string.h b/arch/m68k/include/asm/string.h
>> index f759d944c449..b8f4ae19e8f6 100644
>> --- a/arch/m68k/include/asm/string.h
>> +++ b/arch/m68k/include/asm/string.h
>> @@ -42,7 +42,7 @@ static inline char *strncpy(char *dest, const char *src, size_t n)
>> #define __HAVE_ARCH_STRCMP
>> static inline int strcmp(const char *cs, const char *ct)
>> {
>> - char res;
>> + signed char res;
>>
>> asm ("\n"
>> "1: move.b (%0)+,%2\n" /* get *cs */
>>
>> Does that make sense ? If so I can send a patch.
>
> Thanks, been there, done that
> https://lore.kernel.org/all/bce014e60d7b1a3d1c60009fc3572e2f72591f21.1671110959.git.geert@linux-m68k.org
Well, looks like that would still leave strcmp() buggy, you can't
represent all possible differences between two char values (signed or
not) in an 8-bit quantity. So any implementation based on returning the
first non-zero value of *a - *b must store that intermediate value in
something wider. Otherwise you'll get -128 from strcmp("\x40", "\xc0"),
but _also_ -128 when you do strcmp("\xc0", "\x40"), which is obviously
bogus.
I recently fixed that long-standing bug in U-Boot's strcmp() and a
similar one in nolibc in the linux tree. I wonder how many more
instances exist.
Rasmus
Powered by blists - more mailing lists