Message-ID: <d535f0b5-7319-4a21-a002-eb4074758c22@gmail.com>
Date: Fri, 12 Jan 2024 22:58:41 +0100
From: Michael de Lang <kingoipo@...il.com>
To: "Enrico Weigelt, metux IT consult" <info@...ux.net>,
 "H. Peter Anvin" <hpa@...or.com>, David Howells <dhowells@...hat.com>,
 linux-kernel@...r.kernel.org, pinskia@...il.com
Subject: Re: [PATCH 00/45] C++: Convert the kernel to C++

Thanks for your reply.

>> Namely, to prevent stagnation for the Kernel as well as continue to be 
>> interesting to new developers.
> 
> Which stagnation are you talking about, exactly ?

While I do not know exactly what Linus was thinking about when he 
mentioned stagnation, I assume he was looking at it through the lens of 
long-term maintainers. I'm basing this on the 2021 discussion on LWN: 
https://lwn.net/Articles/870581/. Obviously there are plenty of 
contributors every kernel release, and while I don't have any numbers, 
I don't think the number of contributors or contributions is an issue.

Still, the idea of C discouraging people from contributing resonates 
with me. That is largely subjective, so feel free to ignore it.

> 
> While I've got a long list of ideas for modernizing the kernel
> (which I'm lacking time to actually work on), I'm unsure whether
> C++ really would be of much benefit. Especially considering that for
> many things there's no way to know / define how things will really
> look like on binary level.

Do you have any examples of what exactly in C++ obfuscates the resulting 
binary? Everything I can think of also applies to C: anything 
implementation-defined, e.g. struct layout or high-order bit propagation 
for right shifts of signed values.

There are things in the STL that are implementation defined, but the 
proposal excludes the STL.
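
To make the struct layout point concrete, here is a minimal sketch of 
how layout assumptions can be checked at compile time. The dma_desc 
struct and its fields are hypothetical, made up for illustration, and 
the expected size and offsets assume a typical ABI with 4-byte-aligned 
32-bit integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical hardware descriptor; name and fields are illustrative only.
struct dma_desc {
    std::uint32_t addr;
    std::uint16_t len;
    std::uint16_t flags;
};

// The build fails if the compiler lays the struct out differently
// from what the (assumed) hardware format requires.
static_assert(std::is_standard_layout<dma_desc>::value,
              "dma_desc must be a standard-layout type");
static_assert(sizeof(dma_desc) == 8, "unexpected padding in dma_desc");
static_assert(offsetof(dma_desc, addr) == 0, "unexpected offset of addr");
static_assert(offsetof(dma_desc, len) == 4, "unexpected offset of len");
static_assert(offsetof(dma_desc, flags) == 6, "unexpected offset of flags");
```

The same struct and the same layout questions exist in C; the checks 
just make the assumptions explicit.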

> Personally, the opposite had been one my primary reasons.
> Because it's so simple to understand - in contrast to the usual C++
> monster's i've seen so often in the wild. (I usually try to keep far
> away from C++ projects). 

I have never understood the sentiment that C is supposedly simple. 
The macros used in the kernel are one obvious big argument against it, 
as macros can be considered their own language-inside-a-language. 
Another big argument against the sentiment is the loose type system, 
where void* casts are everywhere you want to do anything remotely 
type-generic, losing type information and making it harder to grok the 
original intent.
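
As a rough illustration of the void* point, compare a type-erased 
helper with a templated one. All names here (min_erased, cmp_int, 
min_typed) are hypothetical and not taken from the kernel; this is only 
a sketch of the pattern.

```cpp
#include <cassert>

// C-style generic helper: the element type is erased, so the compiler
// cannot check that the pointers match what `cmp` expects.
inline const void *min_erased(const void *a, const void *b,
                              int (*cmp)(const void *, const void *)) {
    return cmp(a, b) <= 0 ? a : b;
}

// Comparator that silently assumes both pointers refer to int.
inline int cmp_int(const void *a, const void *b) {
    const int x = *static_cast<const int *>(a);
    const int y = *static_cast<const int *>(b);
    return (x > y) - (x < y);
}

// Template version: the element type is part of the signature, so
// passing two different types fails to compile.
template <typename T>
const T &min_typed(const T &a, const T &b) {
    return b < a ? b : a;
}
```

With min_erased, the caller has to cast the result back and nothing 
stops them from passing pointers to the wrong type. With min_typed, the 
intent is visible in the signature.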

Creating a compiler for C is 'easier' than creating one for C++ (or Rust 
for that matter), but coding in it as a user requires years of 
experience to avoid a lot of the pitfalls. A simple language would be 
something like golang, with its GC and prescribed coding patterns.

C is a language to be (ab)used like any other, the same goes for C++. 
The kernel has shown that it is possible to create maintainable C, I 
feel confident saying that it is also possible in C++.

> Note that C++ is a very complex language,
> and w/ STL it's even much, much more complex.

Note that the proposal here is to use C++ without the STL as well as 
apply some other restrictions.

> Can't judge what you see as interesting, but frankly, I really don't
> have it on my list of interesting things - instead would prefer phasing
> C++ out in favour of many other languages.

I could give you concrete examples of C++ language additions, but I 
don't think that adds much to the discussion. Many languages, 
including C++, have additions that C does not have. These provide 
benefits such as reduced cognitive load and standardised ways of doing 
things that prevent NIH syndrome, and they might enthuse more people to 
contribute to the kernel.

The biggest merit of using C++ in the kernel is that, in comparison to 
other systems languages (Zig, Rust, Swift to name a few), it requires 
the least re-skilling of existing contributors. A close second would be 
the low barrier to integrating C++ and C codebases, especially when 
comparing the architectures that the kernel needs to support with those 
the other languages can target. Even Rust with its big push towards 
being a replacement isn't there yet today (e.g. PA-RISC).

> 
>> other languages, unlike C. The aforementioned metaprogramming is one 
> 
> Metaprogramming can be very interesting indeed - Oberon once made a 
> really good show case, but I wouldn't dare trying that in kernel space.
> And it's hard to do that w/o causing extra performance penalties.

I believe this is a case of having to try it first before being able to 
decisively say anything about the impact. Counter-examples have been 
mentioned elsewhere in the thread.

> 
>> such example, but things like RAII, smart pointers and things like 
>> gsl::not_null would reduce the changes on kernel bugs, especially 
>> memory safety related bugs that are known to be vulnerable to security 
>> issues.
> 
> These are exactly the things I would prefer keeping out of kernel space.
> Indeed there're several areas where it could be nice, but there're
> others where we really can't take it.

As you mention yourself, there are places where such constructs would be 
a boon and places where we should not apply them. I have faith in the 
kernel processes to weed out uses of such constructs where they do not 
belong, as is presumably done already for certain C constructs today.

> 
>> On the other hand, the benefits I mention can also turn into 
>> downsides: if constructs like gsl::not_null are desired, does that 
>> mean that there 
> 
> this seems to be pretty much an assert() - obviously something we really
> cannot have in the kernel.

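gsl::not_null rejects construction from a literal nullptr at 
compile time, and a pointer value that is only known at run time is 
checked when the not_null is constructed. From there on the type 
carries the guarantee, so an assert() in every function that receives 
it would be superfluous. To me it is exactly an example of a C++ 
construct that has no downsides and only upsides.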
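
For readers who have not seen the construct, here is a minimal sketch 
modeled loosely on the idea behind gsl::not_null. It is not the GSL 
implementation: not_null_ptr and read_counter are hypothetical names, 
and calling std::abort() on failure is an assumption made for the 
example, not something a kernel would do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Minimal sketch of a non-null pointer wrapper (not the GSL code).
template <typename T>
class not_null_ptr {
public:
    // Run-time pointer values are checked here, at the boundary.
    explicit not_null_ptr(T *p) : ptr_(p) {
        if (ptr_ == nullptr)
            std::abort();  // assumed failure policy for this sketch
    }

    // A literal nullptr is rejected at compile time.
    not_null_ptr(std::nullptr_t) = delete;

    T &operator*() const { return *ptr_; }
    T *operator->() const { return ptr_; }
    T *get() const { return ptr_; }

private:
    T *ptr_;
};

// Hypothetical callee: the parameter type documents and enforces that
// the pointer is valid, so no null check is needed in the body.
inline int read_counter(not_null_ptr<const int> counter) {
    return *counter;
}
```

Writing not_null_ptr<int>(nullptr) does not compile, and read_counter 
cannot be called with a raw pointer by accident because the 
constructor is explicit.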

> 
>> will be a kernel-specific template library? A KTL instead of STL? That 
>> might be yet another thing that increases the steepness of the kernel 
>> development learning curve.
> 
> Most likely we'd need our own kernel specific library. (we also have one
> instead of libc). Some simple pieces might look similar to STL on the
> front, but it would have to be very different from userland.
> 
> At that point, your previous argument about attracting more people
> who're already used to / like C++ breaks down, because it wouldn't be
> that C++ as usual C++ devs know it (IIRC, STL is integral part of the
> standard), but just the core lang plus some very custom template lib.

It's not that the argument breaks down, it's that it applies to a 
smaller, but still greater than 0, target audience. There are plenty of 
C++ programmers out there that disable the STL on purpose: game 
developers, automotive engineers that I know and so on. You're going to 
be hard-pressed to find concrete numbers, but the fact that the EASTL 
and ETL exist shows the proliferation of non-STL C++ and that the STL 
itself is not an integral part of C++. I recommend you check out ETL 
specifically, I'm sure you'll be amazed at how much functionality it 
has, especially geared for the embedded world.

> 
>> Although compiler-specific, C++20 has enabled implementing RTTI 
>> without RTTI as well as (partial) reflection. 
> 
> You name it: compiler specific.
> 
> Is it even specified how this exactly looks at binary level, and methods
> to control the exact binary data structures ?
> 
> The least thing's need to implement such things is some pointer or tag
> inside each struct/object instance - this would change struct layouts!
> Note that we often use structs to reflect HW specific data structures,
> so we'd need a way to have exact control over this. And then we need to
> be very careful on which instances have RTTI and which ones don't.
> I see debugging nightmares on the horizon ...

I could be convinced that RTTI of any sort is just a bad idea in the 
kernel. It is one of the things that is first to be disabled in embedded 
C++ usage, alongside exceptions. Still, it has its uses even in those 
areas, but that's outside of the scope of this proposal I think.

> 
>> On top of increasing the binary size, 
> 
> That's also a huge problem:
> 
> Templates in general have the strong tendency of producing lots of
> duplicated code. That's what they're designed for: expressing similar
> things (that have to be different on binary level) by the same
> generic source code.
> 
> It might be possible to write them in a way they don't increase binary
> size, but that's not entirely trivial, and so the actual gain of all
> of that becomes questionable again.

Hmm, explicit template instantiations are an 'easy' way to tame the 
code bloat, but any use of templates is going to mean _some_ extra code 
generation. I do not have any concrete kernel examples here, but I'm 
sure there are switch/case statements somewhere in there that can be 
optimized away by using templates. For those, the question is: code 
bloat or run-time performance?
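
A small sketch of that trade-off, using a made-up register-width 
selector (the width enum and both mask functions are hypothetical, and 
`if constexpr` assumes C++17 or later):

```cpp
#include <cassert>
#include <cstdint>

enum class width { w8, w16 };  // hypothetical register-width selector

// Run-time dispatch: a single function, with the switch evaluated on
// every call unless the optimizer can see a constant argument.
inline std::uint32_t mask_runtime(width w, std::uint32_t v) {
    switch (w) {
    case width::w8:  return v & 0xFFu;
    case width::w16: return v & 0xFFFFu;
    }
    return v;
}

// Compile-time dispatch: the branch is resolved per instantiation, at
// the cost of one generated function per value of W that is used.
template <width W>
std::uint32_t mask_static(std::uint32_t v) {
    if constexpr (W == width::w8)
        return v & 0xFFu;
    else
        return v & 0xFFFFu;
}

// Explicit instantiation definitions: emit exactly these two versions
// in this translation unit. Other units would pair this with
// `extern template` declarations to avoid instantiating their own.
template std::uint32_t mask_static<width::w8>(std::uint32_t);
template std::uint32_t mask_static<width::w16>(std::uint32_t);
```

Here the duplication is two tiny functions, so the cost is easy to 
reason about. For larger templates the same mechanism at least keeps 
the set of generated instantiations explicit and reviewable.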

> 
>> this then becomes a discussion on what requirements the kernel puts on 
>> compilers, as I'm sure that the kernel needs to be compiled for 
>> architectures which have a less than stellar conformance to the C++ 
>> specification. 
> Indeed. Also think about embedded environments, where folks can't easily
> upgrade toolchains (e.g. due regulative constraints)
> 

This argument also applies against using Rust and is directly opposed to 
modern security practices. Updating to the latest versions of the OS, 
compilers, libraries etc. is pretty much a given since UN R155 and UN 
R156 came into effect. Though those apply only to automotive so far, the 
Cyber Resilience Act is going to force manufacturers of all kinds to 
adhere to better security. There is definitely a whole debate we can 
have just on the impacts of these regulations and what that should mean, 
but I've already written a lot ;)

Cheers,
Michael de Lang
