[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wijh=SQ_9_-H6O08HgmXrWz37_vcdm55oECo+31LUs2EQ@mail.gmail.com>
Date: Sun, 27 Feb 2022 13:04:48 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Segher Boessenkool <segher@...nel.crashing.org>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@...il.com>,
David Laight <David.Laight@...lab.com>,
Arnd Bergmann <arnd@...db.de>, Jakob <jakobkoschel@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-arch <linux-arch@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Kees Cook <keescook@...omium.org>,
Mike Rapoport <rppt@...nel.org>,
"Gustavo A. R. Silva" <gustavo@...eddedor.com>,
Brian Johannesmeyer <bjohannesmeyer@...il.com>,
Cristiano Giuffrida <c.giuffrida@...nl>,
"Bos, H.J." <h.j.bos@...nl>
Subject: Re: [RFC PATCH 03/13] usb: remove the usage of the list iterator
after the loop
On Sun, Feb 27, 2022 at 12:22 PM Segher Boessenkool
<segher@...nel.crashing.org> wrote:
>
> Requiring to annotate every place that has UB (or *can* have UB!) by the
> user is even less friendly than having so much UB is already :-(
Yeah, I don't think that's the solution. In fact, I don't think that's
even practically the _issue_.
Honestly, a lot of "undefined behavior" in C is quite often of the
kind "the programmer knows what he wants".
Things like word size or byte order issues etc are classic "undefined
behavior" in the sense that the C compiler really doesn't understand
them. The C compiler won't silently fix any silly behavior you get
from writing files in native byte order, and them not working on other
platforms.
Same goes for things like memory allocators - they often need to do
things that the standard doesn't really cover, and shouldn't even
*try* to cover. It's very much a core example of where people do odd
pointer arithmetic and change the type of pointers.
The problem with the C model of "undefined behavior" is not that the
behavior ends up being architecture-specific and depending on various
in-memory (or in-register) representation of the data. No, those
things are often very much intentional (even if in the case of byte
order, the "intention" may be that the programmer simply does not
care, and "knows" that all the world is little endian).
If the C compiler just generates reliable code that can sanely be
debugged - including very much using tools that look for "hey, this
behavior can be surprising", ie all those "look for bad patterns at
run-time", then that would be 100% FINE.
But the problem with the C notion of undefined behavior is NOT that
"results can depend on memory layout and other architecture issues
that the compiler doesn't understand".
No, the problem is that the C standards people - and compiler people -
have then declared that "because this can be surprising, and the
compiler doesn't understand what is going on, now the compiler can do
something *else* entirely".
THAT is the problem.
The classic case - and my personal "beat a dead horse" - is the
completely broken type-based aliasing rules. The standards people
literally said "the compiler doesn't understand this, it can expose
byte order dependencies that shouldn't be explained, SO THE COMPILER
CAN NOW DO SOMETHING COMPLETELY INSANE INSTEAD".
And compiler people? They rushed out to do completely broken garbage -
at least some of them did.
You can literally find compiler people who see code like this (very
traditional, traditionally very valid and common, although):
// Return the least significant 16 bits of 'a' on LE machines
#define low_16_bits(x) (*(unsigned short *)&(x))
and say "oh, because that's undefined, I can now decide to not do what
the programmer told me to do AT ALL".
Note that the above wasn't actually even badly defined originally. It
was well-defined, it was used, and it was done by programmers that
knew what they were doing.
And then the C standards people decided that "because our job isn't to
describe all the architectural issues you can hit, we'll call it
undefined, and in the process let compiler people intentionally break
it".
THAT is a problem.
Undefined results are are often intentional in system software - or
they can be debugged using smart tools (including possibly very
expensive run-time code generation) that actively look for them.
But compilers that randomly do crazy things because the standard was
bad? That's just broken.
If compilers treated "undefined" as the same as
"implementation-defined, but not explicitly documented", then that
would be fine. But the C standards people - and a lot of compiler
people - really don't seem to understand the problems they caused.
And, btw, caused for no actual good reason. The HPC people who wanted
Fortran-style aliasing could easily have had that with an extension.
Yes, "restrict" is kind of a crappy one, but it could have been
improved upon. Instead, people said "let's just break the language".
Same exact thing goes for signed integer overflow.
Linus
Powered by blists - more mailing lists