linux-kernel - Re: [PATCH 00/45] C++: Convert the kernel to C++

Message-ID: <8634v2jh9s.fsf@aarsen.me>
Date: Fri, 12 Jan 2024 22:35:34 +0100
From: Arsen Arsenović <arsen@...sen.me>
To: David Howells <dhowells@...hat.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/45] C++: Convert the kernel to C++


David Howells <dhowells@...hat.com> writes:

> Arsen Arsenović <arsen@...sen.me> wrote:
>
>> >  (2) Constructors and destructors.  Nests of implicit code make the code less
>> >      obvious, and the replacement of static initialisation with constructor
>> >      calls would make the code size larger.
>>
>> This also disallows the primary benefit of C++ (RAII), though.  A lot of
>> static initialization can be achieved using constexpr and consteval,
>> too.
>
> Okay, let me downgrade that to "I wouldn't allow it at first".  The primary
> need for destructors, I think, is exception handling.

I'm not sure I agree: the number of 'goto err' constructs in the kernel
seems to indicate otherwise to me.  They amount to the exact same cleanup
code, except written by hand and hence more error-prone.
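
To make the comparison concrete, here is a minimal sketch of how a
'goto err' unwind path could be replaced by a destructor.  The
scope_guard class, the device struct and device_init() are hypothetical
names made up for illustration (they are not kernel APIs), and the sketch
assumes C++17 for class template argument deduction:

```cpp
#include <cerrno>
#include <cstdlib>
#include <utility>

// Hypothetical scope guard: runs a cleanup callable at scope exit
// unless dismiss() was called first.
template <typename F>
class scope_guard {
public:
    explicit scope_guard(F f) : f_(std::move(f)) {}
    ~scope_guard() { if (armed_) f_(); }
    scope_guard(const scope_guard &) = delete;
    scope_guard &operator=(const scope_guard &) = delete;
    void dismiss() { armed_ = false; }
private:
    F f_;
    bool armed_ = true;
};

// Hypothetical object that owns two buffers once initialised.
struct device {
    void *buf_a;
    void *buf_b;
};

// Returns 0 on success or -ENOMEM on failure.  fail_second simulates
// the second allocation failing.
int device_init(device &dev, bool fail_second)
{
    void *a = std::malloc(64);
    if (!a)
        return -ENOMEM;
    scope_guard free_a([&] { std::free(a); });

    void *b = fail_second ? nullptr : std::malloc(64);
    if (!b)
        return -ENOMEM;     // free_a's destructor releases 'a' here

    dev.buf_a = a;
    dev.buf_b = b;
    free_a.dismiss();       // success: ownership passes to 'dev'
    return 0;
}
```

Each early return unwinds whatever has been acquired so far, so there is
no label ordering to get wrong when a new step is added in the middle.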

> And don't get me wrong, I like the idea of exception handling - so
> many bugs come because we mischeck or forget to check the error.

C++ also provides possible alternative avenues for solving such
problems, such as, for instance, an expected type with monadic
operations: https://en.cppreference.com/w/cpp/utility/expected

IIRC, using std::expected in managarm (where we previously used the IMO
far less nice Frigg expected type) is what initially prompted me to
start enabling the use of a lot of libstdc++ in kernel contexts, and
indeed, it is enabled there:
https://gcc.gnu.org/cgit/gcc/tree/libstdc++-v3/include/Makefile.am#n25
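
As a rough illustration of the monadic style, here is a sketch using a
deliberately tiny stand-in type.  It is not std::expected (which needs
C++23) and not the Frigg type; result, err, parse_len(), check_limit()
and validate() are all hypothetical names.  Only the and_then() chaining
idea is taken from std::expected:

```cpp
#include <utility>

enum class err { none, invalid, too_big };

// Hypothetical, minimal expected-like type: holds either a T or an err.
template <typename T>
class result {
public:
    result(T v) : val_(std::move(v)), err_(err::none) {}
    result(err e) : val_(), err_(e) {}

    bool has_value() const { return err_ == err::none; }
    const T &value() const { return val_; }
    err error() const { return err_; }

    // Run f on the value if there is one; otherwise forward the error
    // without calling f.
    template <typename F>
    auto and_then(F f) const -> decltype(f(std::declval<const T &>()))
    {
        if (!has_value())
            return err_;
        return f(val_);
    }

private:
    T val_;
    err err_;
};

result<int> parse_len(int raw)
{
    if (raw < 0)
        return err::invalid;
    return raw;
}

result<int> check_limit(int len)
{
    if (len > 4096)
        return err::too_big;
    return len;
}

// The error check between the two steps cannot be forgotten, because
// check_limit only ever sees a successfully parsed length.
result<int> validate(int raw)
{
    return parse_len(raw).and_then(check_limit);
}
```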

>> It is incredibly useful to be able to express resource ownership in
>> terms of automatic storage duration.
>
> Oh, indeed, yes - but you also have to be careful:
>
>  (1) You don't always want to wait till the end of the scope before releasing
>      resources.

One could move a resource out, or call a function akin to the 'reset()'
method of std::unique_ptr.
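
Both options, sketched with std::unique_ptr itself.  The buffer type and
the live_buffers counter are hypothetical and exist only to make the
release point observable:

```cpp
#include <memory>
#include <utility>

static int live_buffers = 0;

// Hypothetical resource that counts how many instances are alive.
struct buffer {
    buffer() { ++live_buffers; }
    ~buffer() { --live_buffers; }
    buffer(const buffer &) = delete;
    buffer &operator=(const buffer &) = delete;
};

// Early release: reset() destroys the buffer before the scope ends.
int early_release()
{
    auto buf = std::make_unique<buffer>();
    buf.reset();
    return live_buffers;    // the buffer is already gone at this point
}

// Moving out: ownership leaves the scope with the return value, so the
// local's destructor has nothing left to release.
std::unique_ptr<buffer> hand_off()
{
    auto buf = std::make_unique<buffer>();
    return buf;
}
```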

>  (2) Expressing ownership of something like a lock so that it is automatically
>      undone may require extra memory that is currently unnecessary:
>
> 	struct foo {
> 		struct rwsem sem;
> 	};
>
>
> 	myfunc(struct foo *foo)
> 	{
> 		...
> 		struct foo_shared_lock mylock(foo->sem);
> 		...
> 	}
>
>      This looks like a nice way to automatically take and hold a lock, but I
>      don't think it can be done without storing the address of the semaphore
>      in mylock - something that isn't strictly necessary since we can find sem
>      from foo.

The compiler can often get rid of it.  Here's an example:
https://godbolt.org/z/1W7bnYY7a

Simple enough wrapper classes like these, combined with a modern
compiler's IPA and inlining, can really do magic :-)
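
For readers without access to the link, a wrapper of this kind might
look roughly like the sketch below.  This is not necessarily the code
behind the link; the rwsem struct and the down_read()/up_read()
functions here are simplified stand-ins that only count readers, and
shared_lock_guard is a hypothetical name:

```cpp
// Stand-in semaphore: just counts readers so the effect is observable.
struct rwsem {
    int readers = 0;
};

inline void down_read(rwsem &s) { ++s.readers; }
inline void up_read(rwsem &s) { --s.readers; }

// Hypothetical guard: takes the lock on construction and drops it on
// destruction.  It does hold a reference to the semaphore, but once the
// constructor and destructor are inlined, the compiler can often prove
// that the reference is just &foo->sem and never materialise it.
class shared_lock_guard {
public:
    explicit shared_lock_guard(rwsem &s) : sem_(s) { down_read(sem_); }
    ~shared_lock_guard() { up_read(sem_); }
    shared_lock_guard(const shared_lock_guard &) = delete;
    shared_lock_guard &operator=(const shared_lock_guard &) = delete;
private:
    rwsem &sem_;
};

struct foo {
    rwsem sem;
    int value = 0;
};

// The lock is held while the value is read and released on return.
int read_value(foo &f)
{
    shared_lock_guard guard(f.sem);
    return f.value;
}

// Reports the reader count seen while the guard is alive.
int readers_while_locked(foo &f)
{
    shared_lock_guard guard(f.sem);
    return f.sem.readers;
}
```

Whether the stored reference really disappears depends on the compiler
and optimisation level, so it is worth checking the generated code.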

>  (3) We could implement a magic pointer class that automatically does
>      reference wangling (kref done right) - but we would have to be very
>      careful using it because we want to do the minimum number of atomic ops
>      on its refcount that we can manage, firstly because atomic ops are slow
>      and secondly because the atomic counter must not overflow.

With move semantics, this could be quite effective and general.  The
shared_ptr from the standard library, for instance, won't bump
reference counts if moved.  And temporaries are automatically moved.

You could make the class move-only so that *all* reference incrementing
requires a method call (and hence, is clear and obvious), while still
permitting auto-decrementing and preventing reference leakage.
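
A minimal sketch of such a move-only class, assuming an intrusive
counter.  The object and ref types and the clone() method are
hypothetical names, and std::atomic stands in for whatever refcount
primitive the kernel would actually use:

```cpp
#include <atomic>
#include <utility>

// Hypothetical refcounted object; starts with one reference held by
// whoever created it.
struct object {
    std::atomic<int> refcount{1};
};

class ref {
public:
    // Adopts an existing reference; does not touch the count.
    explicit ref(object *o) : obj_(o) {}

    // Moves transfer the reference without any atomic operation.
    ref(ref &&other) noexcept : obj_(std::exchange(other.obj_, nullptr)) {}
    ref &operator=(ref &&other) noexcept
    {
        if (this != &other) {
            drop();
            obj_ = std::exchange(other.obj_, nullptr);
        }
        return *this;
    }

    // Copying is disabled, so the count never goes up implicitly.
    ref(const ref &) = delete;
    ref &operator=(const ref &) = delete;

    ~ref() { drop(); }

    // The only way to take an additional reference.
    ref clone() const
    {
        if (obj_)
            obj_->refcount.fetch_add(1);
        return ref(obj_);
    }

    object *get() const { return obj_; }

private:
    // Drops our reference, freeing the object if it was the last one.
    void drop()
    {
        if (obj_ && obj_->refcount.fetch_sub(1) == 1)
            delete obj_;
        obj_ = nullptr;
    }

    object *obj_;
};
```

Every increment shows up in the source as a clone() call, while the
matching decrement is still issued automatically at scope exit.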

>> >  (5) Function overloading (except in special inline cases).
>>
>> Generic code, another significant benefit of C++, requires function
>> overloading, though.
>
> I know.  But I was thinking that we might want to disable name mangling if we
> can so as not to bloat the size of the kernel image.  That said, I do like the
> idea of being able to have related functions of the same name with different
> arguments rather than having to name each one differently.

Hmm, I can understand the symbol table size being an issue.

>> >  (7) 'class', 'private', 'namespace'.
>>
>> 'class' does nothing that struct doesn't do, private and namespace serve
>> simply for encapsulation, so I don't see why banning these is useful.
>
> Namespaces would lead to image bloat as they make the symbols bigger.
> Remember, the symbol list uses up unswappable memory.

Ah, I was not aware of this restriction of the kernel (my understanding
was that the symbol table is outside of the kernel image).  That poses a
problem, yes.  I wonder if a big part of the symbol table (or even the
entirety of it) could be dropped from the kernel.  I must say, I do not
know why the kernel has it, so I cannot speak on this issue.

> We use class and private a lot as symbols already, so to get my stuff to
> compile I had to #define them.  Granted there's nothing intrinsically
> different about classes and we could rename every instance of the symbol in
> the kernel first.

I see.  That is quite understandable then, especially if temporary.

> When it comes to 'private', actually, I might withdraw my objection to it: it
> would help delineate internal fields - but we would then have to change
> out-of-line functions that use it to be members of the class - again
> potentially increasing the size of the symbol table.

This is what I like about it too.

>> >  (8) 'virtual'.  Don't want virtual base classes, though virtual function
>> >      tables might make operations tables more efficient.
>>
>> Virtual base classes are seldom useful, but I see no reason to
>> blanket-ban them (and I suspect you'll never notice that they're not
>> banned).
>
> You can end up increasing the size of your structure as you may need multiple
> virtual method pointer tables - and we have to be very careful about that as
> some structures (dentry, inode and page for example) we have a *lot* of
> instances of in a running kernel.

I retract what I said about virtual base classes - I had, indeed,
forgotten about that issue (but, again, I doubt anyone will miss them ;-) ).

>> >  (2) Direct assignment of pointers to/from void* isn't allowed by C++, though
>> >      g++ grudgingly permits it with -fpermissive.  I would imagine that a
>> >      compiler option could easily be added to hide the error entirely.
>>
>> This should never be useful.
>
> It's not a matter of whether it should be useful - we do this an awful lot and
> every case of assigning to/from a void pointer would require some sort of
> cast.

I see.  That could pose significant trouble.

Ideally, nearly all uses of void* could be eliminated sooner or later, as
C++ has a more flexible (despite being stricter) type system.
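
As one hedged example of what that could mean, here is a sketch of a
callback slot with and without void*.  All of the names (work_c, work,
counter, bump, bump_untyped) are hypothetical and not taken from the
kernel:

```cpp
// C-style slot: the data pointer is untyped, so the callback has to
// cast it back and nothing checks that the cast is right.
struct work_c {
    void (*fn)(void *);
    void *data;
};

// Typed slot: the template parameter carries the type, so no cast is
// needed and a mismatched callback fails to compile.
template <typename T>
struct work {
    void (*fn)(T &);
    T *data;

    void run() const { fn(*data); }
};

struct counter {
    int hits = 0;
};

inline void bump(counter &c) { ++c.hits; }

inline void bump_untyped(void *p)
{
    ++static_cast<counter *>(p)->hits;
}
```

Each distinct T instantiates its own copy of the template, which ties
back into the image-size concerns discussed above.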

Have a lovely day!

> David


--
Arsen Arsenović
