Message-ID: <d535f0b5-7319-4a21-a002-eb4074758c22@gmail.com>
Date: Fri, 12 Jan 2024 22:58:41 +0100
From: Michael de Lang <kingoipo@...il.com>
To: "Enrico Weigelt, metux IT consult" <info@...ux.net>,
"H. Peter Anvin" <hpa@...or.com>, David Howells <dhowells@...hat.com>,
linux-kernel@...r.kernel.org, pinskia@...il.com
Subject: Re: [PATCH 00/45] C++: Convert the kernel to C++
Thanks for your reply.
>> Namely, to prevent stagnation for the Kernel as well as continue to be
>> interesting to new developers.
>
> Which stagnation are you talking about, exactly ?
While I do not know exactly what Linus was thinking about when he
mentioned stagnation, I assume he was looking at it through the lens of
long-term maintainers. I'm basing this on the 2021 discussion on LWN:
https://lwn.net/Articles/870581/. Obviously there are plenty of
contributors every kernel release, and while I don't have any numbers
there, I don't think the number of contributors or contributions is an
issue. Still, the idea of C discouraging people from contributing
resonates with me. That is largely subjective, so feel free to ignore it.
>
> While I've got a long list of ideas for modernizing the kernel
> (which I'm lacking time to actually work on), I'm unsure whether
> C++ really would be of much benefit. Especially considering that for
> many things there's no way to know / define how things will really
> look like on binary level.
Do you have any examples of what exactly in C++ obscures the resulting
binary? Everything I can think of also applies to C: anything
implementation-defined, e.g. struct layout or high-order bit propagation
for shift operations.
There are things in the STL that are implementation-defined, but the
proposal excludes the STL.
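
As a rough illustration of the layout point (the `dma_desc` struct and
its fields are hypothetical, not taken from the kernel), assumptions
about implementation-defined behaviour can be pinned down at compile
time. The same checks can be written in C11 with _Static_assert, so
nothing here is specific to C++:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical hardware descriptor; name and fields are illustrative only.
struct dma_desc {
    std::uint32_t addr_lo;
    std::uint32_t addr_hi;
    std::uint16_t len;
    std::uint16_t flags;
};

// offsetof is only unconditionally supported for standard-layout types.
static_assert(std::is_standard_layout_v<dma_desc>,
              "descriptor must stay a plain C-compatible struct");

// Fail the build if the compiler lays the struct out differently
// from what the (hypothetical) hardware expects.
static_assert(offsetof(dma_desc, addr_hi) == 4, "unexpected padding");
static_assert(offsetof(dma_desc, len) == 8, "unexpected padding");
static_assert(offsetof(dma_desc, flags) == 10, "unexpected padding");
static_assert(sizeof(dma_desc) == 12, "unexpected tail padding");

// Right-shifting a negative value is implementation-defined in C and in
// C++ before C++20; this records the assumption of an arithmetic shift.
static_assert((-8 >> 1) == -4, "this build assumes arithmetic right shift");
```

If any assumption does not hold for a given compiler or architecture,
the build fails instead of producing a silently different binary.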
> Personally, the opposite had been one of my primary reasons.
> Because it's so simple to understand - in contrast to the usual C++
> monsters I've seen so often in the wild. (I usually try to keep far
> away from C++ projects).
I have never understood the sentiment that C is supposedly simple.
Looking at the macros used in the kernel is one obvious big argument
against using C, as macros can be considered their own
language-inside-a-language. Another big argument against the sentiment
is the loose type system, where void* casts are everywhere you want to
do anything remotely type-generic, losing type information and making it
harder to grok the original intent.
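
A minimal sketch of the void* point, with hypothetical names throughout
(`counter`, `call_untyped` and `call_typed` are made up for this
example): the untyped version compiles no matter what is passed in,
while the templated version carries the context type in its signature.

```cpp
#include <cassert>

struct counter { int n; };  // hypothetical context type

// C style: the context travels as void*, so the callee casts back and
// the compiler cannot tell whether the caller passed the right object.
int next_untyped(void* ctx) { return static_cast<counter*>(ctx)->n + 1; }
int call_untyped(int (*cb)(void*), void* ctx) { return cb(ctx); }

// Template style: callback and context must agree on the type Ctx.
int next_typed(counter& c) { return c.n + 1; }

template <typename Ctx>
int call_typed(int (*cb)(Ctx&), Ctx& ctx) { return cb(ctx); }

inline void example() {
    counter c{41};
    assert(call_untyped(next_untyped, &c) == 42);
    assert(call_typed(next_typed, c) == 42);

    // double d = 1.0;
    // call_untyped(next_untyped, &d);  // compiles; undefined behaviour
    // call_typed(next_typed, d);       // rejected: Ctx cannot be deduced
}
```

Both calls produce the same result for a correct caller; the difference
only shows up when the caller gets the type wrong.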
Creating a compiler for C is 'easier' than creating one for C++ (or Rust
for that matter), but coding in it as a user requires years of
experience to avoid a lot of the pitfalls. A simple language would be
something like golang, with its GC and prescribed coding patterns.
C is a language to be (ab)used like any other, and the same goes for
C++. The kernel has shown that it is possible to write maintainable C;
I feel confident saying that it is also possible in C++.
> Note that C++ is a very complex language,
> and w/ STL it's even much, much more complex.
Note that the proposal here is to use C++ without the STL as well as
apply some other restrictions.
> Can't judge what you see as interesting, but frankly, I really don't
> have it on my list of interesting things - instead would prefer phasing
> C++ out in favour of many other languages.
I could give you concrete examples of C++ language additions, but I
don't think that adds much to the discussion. Many languages, including
C++, have additions that C lacks. These provide benefits such as reduced
cognitive load and standardised ways of doing things that prevent NIH
syndrome, and they might enthuse more people to contribute to the
kernel.
The biggest merit of using C++ in the kernel is that, in comparison to
other systems languages (Zig, Rust, Swift to name a few), it requires
the least re-skilling of existing contributors. A close second would be
the low barrier to integrating C++ and C codebases, especially when
taking into account the architectures that the kernel needs to support
compared with what the other languages support. Even Rust, with its big
push towards being a replacement, isn't there yet today (e.g. PA-RISC).
>
>> other languages, unlike C. The aforementioned metaprogramming is one
>
> Metaprogramming can be very interesting indeed - Oberon once made a
> really good show case, but I wouldn't dare trying that in kernel space.
> And it's hard to do that w/o causing extra performance penalties.
I believe this is a case of having to try it first before being able to
decisively say anything about the impact. Counter-examples have been
mentioned elsewhere in the thread.
>
>> such example, but things like RAII, smart pointers and things like
>> gsl::not_null would reduce the chances of kernel bugs, especially
>> memory safety related bugs that are known to be vulnerable to security
>> issues.
>
> These are exactly the things I would prefer keeping out of kernel space.
> Indeed there're several areas where it could be nice, but there're
> others where we really can't take it.
As you mention yourself, there are places where such constructs would be
a boon and places where we should not apply them. I have faith in the
kernel processes to weed out uses of these constructs where they do not
belong, as is presumably done already for certain C constructs today.
>
>> On the other hand, the benefits I mention can also turn into
>> downsides: if constructs like gsl::not_null are desired, does that
>> mean that there
>
> this seems to be pretty much an assert() - obviously something we really
> cannot have in the kernel.
gsl::not_null refuses to hold NULL: constructing one from a literal
nullptr is rejected at compile time, and any other pointer is checked
once, at the point where the wrapper is created. Code that receives a
not_null therefore needs no assert() of its own. To me it is an example
of a C++ construct that has no downsides and only upsides.
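
To show the mechanism, here is a simplified, hypothetical stand-in for
gsl::not_null (this is a sketch, not the GSL implementation, and
`read_value` is made up for the example). A literal nullptr is rejected
by the compiler; a pointer whose value is only known at run time is
still checked once, when the wrapper is constructed:

```cpp
#include <cstddef>
#include <exception>
#include <type_traits>

// Simplified sketch of a not_null wrapper; raw pointers only.
template <typename T>
class not_null {
    static_assert(std::is_pointer_v<T>, "sketch only handles raw pointers");

public:
    // Constructing from a literal nullptr does not compile.
    not_null(std::nullptr_t) = delete;

    // Any other pointer is checked once, here.
    constexpr explicit not_null(T p) : ptr_(p) {
        if (ptr_ == nullptr) std::terminate();
    }

    constexpr T get() const { return ptr_; }
    constexpr std::remove_pointer_t<T>& operator*() const { return *ptr_; }
    constexpr T operator->() const { return ptr_; }

private:
    T ptr_;
};

// The callee can dereference without a NULL check of its own, because
// the type guarantees the check already happened at construction.
constexpr int read_value(not_null<const int*> p) { return *p; }

static_assert(!std::is_constructible_v<not_null<int*>, std::nullptr_t>,
              "a literal nullptr must be rejected at compile time");
```

What a kernel variant would do on a failed run-time check (std::terminate
is just a placeholder here) is a separate design question.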
>
>> will be a kernel-specific template library? A KTL instead of STL? That
>> might be yet another thing that increases the steepness of the kernel
>> development learning curve.
>
> Most likely we'd need our own kernel specific library. (we also have one
> instead of libc). Some simple pieces might look similar to STL on the
> front, but it would have to be very different from userland.
>
> At that point, your previous argument about attracting more people
> who're already used to / like C++ breaks down, because it wouldn't be
> that C++ as usual C++ devs know it (IIRC, STL is integral part of the
> standard), but just the core lang plus some very custom template lib.
It's not that the argument breaks down, it's that it applies to a
smaller, but still greater than 0, target audience. There are plenty of
C++ programmers out there that disable the STL on purpose: game
developers, automotive engineers that I know and so on. You're going to
be hard-pressed to find concrete numbers, but the fact that the EASTL
and ETL exist shows the proliferation of non-STL C++ and that the STL
itself is not an integral part of C++. I recommend you check out ETL
specifically, I'm sure you'll be amazed at how much functionality it
has, especially geared for the embedded world.
>
>> Although compiler-specific, C++20 has enabled implementing RTTI
>> without RTTI as well as (partial) reflection.
>
> You name it: compiler specific.
>
> Is it even specified how this exactly looks at binary level, and methods
> to control the exact binary data structures ?
>
> The least that's needed to implement such things is some pointer or tag
> inside each struct/object instance - this would change struct layouts!
> Note that we often use structs to reflect HW specific data structures,
> so we'd need a way to have exact control over this. And then we need to
> be very careful on which instances have RTTI and which ones don't.
> I see debugging nightmares on the horizon ...
I could be convinced that RTTI of any sort is just a bad idea in the
kernel. It is one of the first things to be disabled in embedded C++
usage, alongside exceptions. Still, it has its uses even in those
areas, but that's outside of the scope of this proposal I think.
>
>> On top of increasing the binary size,
>
> That's also a huge problem:
>
> Templates in general have the strong tendency of producing lots of
> duplicated code. That's what they're designed for: expressing similar
> things (that have to be different on binary level) by the same
> generic source code.
>
> It might be possible to write them in a way they don't increase binary
> size, but that's not entirely trivial, and so the actual gain of all
> of that becomes questionable again.
Hmm, explicit template instantiations are an 'easy' way to tame the
code bloat, but any use of templates is going to mean _some_ extra code
generation. I do not have any concrete kernel examples here, but I'm
sure there are switch/case statements somewhere in there that can be
optimized away by using templates. For those, the question is: code
bloat or run-time performance?
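
A small sketch of that trade-off, using made-up names (`width`,
`mask_runtime` and `mask_static` are hypothetical, not kernel code). The
first function decides with a switch on every call; the template moves
the decision to compile time at the cost of one function body per
variant, and the explicit instantiations list exactly which variants get
emitted:

```cpp
#include <cassert>
#include <cstdint>

enum class width { w8, w16, w32 };  // hypothetical access widths

// Run-time dispatch: one function body, the switch runs on every call.
std::uint32_t mask_runtime(width w, std::uint32_t v) {
    switch (w) {
    case width::w8:  return v & 0xffu;
    case width::w16: return v & 0xffffu;
    case width::w32: return v;
    }
    return v;
}

// Compile-time dispatch: the branch is resolved per instantiation, so
// no switch is left at run time, but each width is a separate function.
template <width W>
std::uint32_t mask_static(std::uint32_t v) {
    if constexpr (W == width::w8)       return v & 0xffu;
    else if constexpr (W == width::w16) return v & 0xffffu;
    else                                return v;
}

// Explicit instantiation definitions: code for exactly these variants
// is emitted in this translation unit. A header would pair them with
// matching "extern template" declarations so that other translation
// units do not instantiate their own copies.
template std::uint32_t mask_static<width::w8>(std::uint32_t);
template std::uint32_t mask_static<width::w16>(std::uint32_t);
template std::uint32_t mask_static<width::w32>(std::uint32_t);

inline void example() {
    // Both forms agree; only where the decision is made differs.
    assert(mask_static<width::w8>(0x1234u) ==
           mask_runtime(width::w8, 0x1234u));
}
```

Whether three small functions beat one function with a switch depends on
the call sites, which is the "try it first" point made earlier.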
>
>> this then becomes a discussion on what requirements the kernel puts on
>> compilers, as I'm sure that the kernel needs to be compiled for
>> architectures which have a less than stellar conformance to the C++
>> specification.
> Indeed. Also think about embedded environments, where folks can't easily
> upgrade toolchains (e.g. due regulative constraints)
>
This argument also applies against using Rust and is directly opposed to
modern security practices. Updating to the latest version for
OS/compilers/libraries etc is pretty much a given since UN R155 and UN
R156 came into effect. Though those apply only to automotive so far, the
Cyber Resilience Act is going to force manufacturers of all kinds to
adhere to better security. There is definitely a whole debate we can
have just on the impacts of these regulations and what that should mean,
but I've already written a lot ;)
Cheers,
Michael de Lang