lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <s6fgd3jzckxv3lrylph5cl4l3ykk7qdrnjwlvwe3pxxc2zgch5@2jgoh5lkoh5b>
Date: Fri, 26 Sep 2025 10:29:52 +0200
From: Alejandro Colomar <alx@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Kees Cook <kees@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>, 
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, Christopher Bazley <chris.bazley.wg14@...il.com>, 
	Rasmus Villemoes <linux@...musvillemoes.dk>, Marco Elver <elver@...gle.com>, Michal Hocko <mhocko@...e.com>, 
	Al Viro <viro@...iv.linux.org.uk>, Alexander Potapenko <glider@...gle.com>, 
	Dmitry Vyukov <dvyukov@...gle.com>, Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH v1 0/3] Add ENDOF(), and use it to fix off-by-one bugs

Hi Linus, Kees, Andrew,

On Thu, Sep 25, 2025 at 05:00:30PM -0700, Kees Cook wrote:
> On Thu, Sep 25, 2025 at 01:48:14PM -0700, Andrew Morton wrote:
> > Patchset seems reasonable, I guess.  But I'm not loving "ENDOF".  End
> > of what - is that like __etext?  "ARRAY_END" matches "ARRAY_SIZE" quite
> > nicely, no?  And it much better describes what the thing does.
>
> And it's really ARRAY_BEYOND. ;) I don't really like having APIs that
> require holding pointers that are actively invalid, either.

The name END has some conventions.  It's not arbitrarily chosen.

The C standard is quite consistent in referring to the last addressable
element of an array as the LAST element.

For what we're dealing with, which is a pointer to one after that thing,
the C standard is also consistent in calling it 'one past the last
element'.  It doesn't have a single-word term, though.

However, C++ does have a single term: 'std::end'.

And many projects, when having to come up with a single-word term for
this thing, natural evolution shows it converges to 'end'.  Even in the
kernel, as your local pointer variables are actually called 'end' and
not 'beyond'.

> u8 array[10];
> u8 *first = array; // valid address
> u8 *last = &array[ARRAY_SIZE(array) - 1]; // valid address

This last is correctly used, as a term, but quite dangerous, as Linus
points out.

LASTOF() would be something quite more dangerous than ENDOF().  First of
all, because it's not guaranteed to exist, as Linus points out.
LASTOF() applied to a zero-length array would be Undefined Behavior
straight away.  That's indeed why we have guarantees that the end always
exists and can be held.

>
> for (u8 *c = first; c <= last; c++)

Secondly, loops with <= are weird and error prone.

>   putc(*c);
> // c would now be invalid but has left scope so it cannot be used
>
> --
> Kees Cook


On Thu, Sep 25, 2025 at 07:36:13PM -0700, Linus Torvalds wrote:
> On Thu, 25 Sept 2025 at 18:31, Kees Cook <kees@...nel.org> wrote:
> >
> > I can have an opinion about the relative safety of holding pointers that
> > can't be safely dereferenced, though. :) But yes, I've long since lost
> > the argument that C should avoid these kinds of past-the-end tokens.
> 
> The thing is, the "start+len" model is actually *safer* than the
> "start+len-1" model.
> 
> It's safer because it's simpler and doesn't involve the whole
> "subtract one" (and then when you have the "past the end" it's also
> easy to calculate "how much is left").
> 
> So it's simpler, and it's explicitly supported by the standard.
> 
> It's also more strictly correct, because "start+len-1" is technically
> not a valid pointer at all. The C standard makes the "past the end"
> case explicitly valid, but not the "before the beginning".
> 
> Now, I'm the last person to say that the C standard is always correct
> - there's a lot of garbage in there, but in this case it also happens
> to match the notion of "this works".
> 
> Because the "pointer to the last byte" model DOES NOT WORK.

So far, fully agree.

> In C, NULL is actually a valid pointer for zero-sized elements.

This is only valid since very recently.

Remember that 'NULL - NULL' is (or was) Undefined Behavior.  That
operation doesn't result in 0, per the standard.  NULL < NULL is (or
was) also Undefined Behavior.

Consider this innocent loop:

	end = ptr + size; 

	for (p = ptr; p < end; p++)
		*p = 0;

If ptr were NULL (trying to represent a zero-size array), we'd at least
incur in one NULL < NULL operation.

However, a unique pointer, as returned by glibc malloc(0), would be
a valid pointer, and p<p would be perfectly valid.  p-p would also
perfectly result in 0.  Of course you can't access p[i] for any i>=0,
but that's true of any object.  If you have an array of 2, you can't
access p[i] for any i>=2.  Everything is consistent with unique
pointers.

So, the whole model of malloc(0) returning NULL as successfully
allocating 0 bytes is broken.

Recently, the committee decided that memcpy(3) would accept NULL if the
size is 0, and I'm not sure if they also defined the behavior for any of
NULL < NULL or NULL - NULL.  I'd still be careful using NULL to
represent a zero-size array.

> Yes, really.
> 
> The C standard says that "malloc(0)" can return NULL, without it being
> an error. So the tuple "NULL, 0" is actually a perfectly valid
> "pointer and length" pair, and one that you may actually get thanks to
> how malloc() works.

This is something I want to change.  malloc(0) returning NULL is
terrible.  realloc(p,0) returning NULL is even more terrible.  And by
terrible, I mean it has caused remote-code-execution exploits.

<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3621.txt>

> Obviously you cannot dereference a zero-sized object, but zero-sized
> objects aren't "wrong" per se.
> 
> Now, I happen to believe that the "return NULL for zero sized
> allocations" it's not a great model (it also makes error checking more
> complicated). So the kernel kmalloc() function actually returns a
> different pointer that cannot be dereferenced (grep for
> ZERO_SIZE_PTR).
> 
> But my point is that "ptr+len" actually *works* for that case.
> 
> The "ptr+len-1" loop thing you gave as an example not only is invalid
> C for zero-sized cases, it simply does not even work at all for the
> NULL case.
> 
> Your example loop of how things should work IS WRONG.
> 
> Yes, it happens to work as long as you never have zero-sized things,
> but it really is a bad pattern.
> 
> This is not "opinion". This is just a plain fact about how C works.
> "start+len" is well-defined. "start+len-1" is NOT.

Agree.


Have a lovely day!
Alex

> 
>                Linus

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

Download attachment "signature.asc" of type "application/pgp-signature" (834 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ