[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <krmt6a25gio6ing5mgahl72nvw36jc7u3zyyb5dzbk4nfjnuy4@fex2h7lqmfwt>
Date: Fri, 11 Jul 2025 01:23:49 +0200
From: Alejandro Colomar <alx@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-mm@...ck.org, linux-hardening@...r.kernel.org,
Kees Cook <kees@...nel.org>, Christopher Bazley <chris.bazley.wg14@...il.com>,
shadow <~hallyn/shadow@...ts.sr.ht>, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>, kasan-dev@...glegroups.com, Dmitry Vyukov <dvyukov@...gle.com>,
Alexander Potapenko <glider@...gle.com>, Marco Elver <elver@...gle.com>, Christoph Lameter <cl@...ux.com>,
David Rientjes <rientjes@...gle.com>, Vlastimil Babka <vbabka@...e.cz>,
Roman Gushchin <roman.gushchin@...ux.dev>, Harry Yoo <harry.yoo@...cle.com>,
Andrew Clayton <andrew@...ital-domain.net>, Rasmus Villemoes <linux@...musvillemoes.dk>,
Michal Hocko <mhocko@...e.com>, Al Viro <viro@...iv.linux.org.uk>,
Martin Uecker <uecker@...raz.at>, Sam James <sam@...too.org>, Andrew Pinski <pinskia@...il.com>
Subject: Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()
Hi Linus,
[I'll reply to both of your emails at once]
On Thu, Jul 10, 2025 at 02:58:24PM -0700, Linus Torvalds wrote:
> You took my suggestion, and then you messed it up.
>
> Your version of sprintf_array() is broken. It evaluates 'a' twice.
> Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> argument.
An array has no issue being evaluated twice (unless it's a VLA). On the
other hand, I agree it's better to not do that in the first place.
My bad for forgetting about it. Sorry.
On Thu, Jul 10, 2025 at 03:08:29PM -0700, Linus Torvalds wrote:
> If you want to return an error on truncation, do it right. Not by
> returning NULL, but by actually returning an error.
Okay.
> For example, in the kernel, we finally fixed 'strcpy()'. After about a
> million different versions of 'copy a string' where every single
> version was complete garbage, we ended up with 'strscpy()'. Yeah, the
> name isn't lovely, but the *use* of it is:
I have implemented the same thing in shadow, called strtcpy() (T for
truncation). (With the difference that we read the string twice, since
we don't care about threads.)
I also plan to propose standardization of that one in ISO C.
> - it returns the length of the result for people who want it - which
> is by far the most common thing people want
Agree.
> - it returns an actual honest-to-goodness error code if something
> overflowed, instead of the absoilutely horrible "source length" of the
> string that strlcpy() does and which is fundamentally broken (because
> it requires that you walk *past* the end of the source,
> Christ-on-a-stick what a broken interface)
Agree.
> - it can take an array as an argument (without the need for another
> name - see my earlier argument about not making up new names by just
> having generics)
We can't make the same thing with sprintf() variants because they're
variadic, so you can't count the number of arguments. And since the
'end' argument is of the same type as the formatted string, we can't
do it with _Generic reliably either.
> Now, it has nasty naming (exactly the kind of 'add random character'
> naming that I was arguing against), and that comes from so many
> different broken versions until we hit on something that works.
>
> strncpy is horrible garbage. strlcpy is even worse. strscpy actually
> works and so far hasn't caused issues (there's a 'pad' version for the
> very rare situation where you want 'strncpy-like' padding, but it
> still guarantees NUL-termination, and still has a good return value).
Agree.
> Let's agree to *not* make horrible garbage when making up new versions
> of sprintf.
Agree. I indeed introduced the mistake accidentally in v4, after you
complained of having too many functions, as I was introducing not one
but two APIs: seprintf() and stprintf(), where seprintf() is what now
we're calling sprintf_end(), and stprintf() we could call it
sprintf_trunc(). So I did the mistake by trying to reduce the number of
functions to just one, which is wrong.
So, maybe I should go back to those functions, and just give them good
names.
What do you think of the following?
#define sprintf_array(a, ...) sprintf_trunc(a, ARRAY_SIZE(a), __VA_ARGS__)
#define vsprintf_array(a, ap) vsprintf_trunc(a, ARRAY_SIZE(a), ap)
char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
int sprintf_trunc(char *buf, size_t size, const char *fmt, ...);
int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args);
char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
{
va_list args;
va_start(args, fmt);
p = vseprintf(p, end, fmt, args);
va_end(args);
return p;
}
char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
{
int len;
if (unlikely(p == NULL))
return NULL;
len = vsprintf_trunc(p, end - p, fmt, args);
if (unlikely(len < 0))
return NULL;
return p + len;
}
int sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
{
va_list args;
int len;
va_start(args, fmt);
len = vstprintf(buf, size, fmt, args);
va_end(args);
return len;
}
int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args)
{
int len;
if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
return -EOVERFLOW;
len = vsnprintf(buf, size, fmt, args);
if (unlikely(len >= size))
return -E2BIG;
return len;
}
sprintf_trunc() is like strscpy(), but with a formatted string. It
could replace uses of s[c]nprintf() where there's a single call (no
chained calls).
sprintf_array() is like the 2-argument version of strscpy(). It could
replace s[c]nprintf() calls where there's no chained calls, where the
input is an array.
sprintf_end() would replace the chained calls.
Does this sound good to you?
Cheers,
Alex
--
<https://www.alejandro-colomar.es/>
Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)
Powered by blists - more mailing lists