Message-ID: <CAJgzZoqUV6gSfgCWbfe6oSH5k9qt30gpJ0epa+w78WQUgTCqNQ@mail.gmail.com>
Date: Thu, 15 May 2025 16:26:03 -0400
From: enh <enh@...gle.com>
To: "H. Peter Anvin" <hpa@...or.com>
Cc: Arnd Bergmann <arnd@...db.de>, LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, libc-alpha@...rceware.org,
linux-arch@...r.kernel.org
Subject: Re: Metalanguage for the Linux UAPI
On Thu, May 15, 2025 at 4:05 PM H. Peter Anvin <hpa@...or.com> wrote:
>
> OK, so this is something I have been thinking about for quite a while.
> It would be a quite large project, so I would like to hear people's
> opinions on it before even starting.
>
> We have finally succeeded in divorcing the Linux UAPI from the general
> kernel headers, but even so, there are a lot of things in the UAPI that
> mean it is not possible for an arbitrary libc to use it directly; for
> example "struct termios" is not the glibc "struct termios", but
> redefining it breaks the ioctl numbering unless the ioctl headers are
> changed as well, and so on. However, other libcs want to use the struct
> termios as defined in the kernel, or, more likely, struct termios2.
bionic is a ("the only"?) libc that tries to not duplicate _anything_
and always defer to the uapi headers. we have quite an extensive list
of hacks we need to apply to rewrite the uapi headers into something
directly usable (and a lot of awful python to apply those hacks):
https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/kernel/tools/defaults.py
a lot are just name collisions ("you say 'class', my c++ compiler says
wtf?!"), but there are a few "posix and linux disagree"s too. (other
libcs that weren't linux-only from day one might have more conflicts,
such as a comically large sigset_t, say :-) )
but i think most if not all of that could be fixed upstream, given the will?
(though some c programmers do still get upset if told they shouldn't
use c++ keywords as identifiers, i note that the uapi headers _were_
recently fixed to avoid a c extension that's invalid c++. thanks,
anyone involved in that who's reading this!)
> Furthermore, I was looking further into how C++ templates could be used
> to make user pointers inherently safe and probably more efficient, but
> ran into the problem that you really want to be able to convert a
> user-tagged structure to a structure with "safe-user-tagged" members
> (after access_ok), which turned out not to be trivially supportable even
> after the latest C++ modernizations (without which I don't consider C++
> viable at all; I would not consider versions of C++ before C++17 worthy
> of even looking at; C++20 preferred.)
(/me assumes you're just trolling linus with this.)
> And it is not just generation of in-kernel versus out-of-kernel headers
> that is an issue (which we have managed to deal with pretty well.) There
> generally isn't enough information in C headers alone to do well at
> creating bindings for other languages, *especially* given how many
> constants are defined in terms of macros.
(yeah, while i think the _c_ [and c++] problems could be solved much
more easily, solving the swift/rust/golang duplication of all that
stuff is a whole other thing. i'd try to sign up the maintainers of
one of those languages' libraries before investing too much in having
another representation of the uapi though...)
> The use of C also makes it hard to mangle the headers for user space.
> For example, glibc has to add __extension__ before anonymous struct or
> union members in order to be able to compile in strict C90 mode.
(again, that one seems easily fixable upstream.)
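for reference, a minimal sketch of the construct being discussed. `struct example_event` and `example_fd()` are made up for illustration and not taken from any real uapi header; `__extension__` is the gcc/clang keyword that silences the pedantic diagnostic for the anonymous union in strict c90 mode:

```c
#include <stddef.h>

/* hypothetical struct, not from any real uapi header */
struct example_event {
    int type;
    /* anonymous unions are not part of C90; __extension__ tells
     * gcc/clang not to complain about one under -std=c90 -pedantic */
    __extension__ union {
        int fd;
        long cookie;
    };
};

/* members of the anonymous union are accessed as if they were
 * direct members of the enclosing struct */
static int example_fd(const struct example_event *ev)
{
    return ev->fd;
}
```

whether the annotation lives in the uapi header itself or gets added by a libc's rewrite step is exactly the question above; the sketch only shows what the annotated form looks like.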
> I have been considering if it would make sense to create more of a
> metalanguage for the Linux UAPI. This would be run through a more
> advanced preprocessor than cpp written in C and yacc/bison. (It could
> also be done via a gcc plugin or a DWARF parser, but I do not like tying
> this to compiler internals, and DWARF parsing is probably more complex
> and less versatile.)
>
> It could thus provide things like "true" constants (constexpr for C++11
> or C23, or enums), bitfield macro explosions and so on, depending on
> what the backend user would like: namespacing, distributed enumerations,
> and assembly offset constants, and even possibly syscall stubs.
(given a clean slate that wouldn't be terrible, but you get a lot of
#if nonsense. though the `#define foo foo` trick lets you have the
best of both worlds [at some cost to compile time].)
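for anyone who hasn't seen that trick, a minimal sketch. the `EXAMPLE_*` names and the `have_flag_*()` helpers are made up for illustration, not from any real uapi header. each enumerator gets a same-named macro, so the value is a real typed constant but `#ifdef` feature tests still work:

```c
/* hypothetical constants, not from any real uapi header */
enum example_flag {
    EXAMPLE_FLAG_A = 1,
#define EXAMPLE_FLAG_A EXAMPLE_FLAG_A
    EXAMPLE_FLAG_B = 2,
#define EXAMPLE_FLAG_B EXAMPLE_FLAG_B
};

/* returns 1 because the self-referential macro makes the
 * enumerator visible to the preprocessor */
static int have_flag_b(void)
{
#ifdef EXAMPLE_FLAG_B
    return 1;
#else
    return 0;
#endif
}

/* returns 0: EXAMPLE_FLAG_C was never defined, so code can
 * feature-test for it at preprocessing time */
static int have_flag_c(void)
{
#ifdef EXAMPLE_FLAG_C
    return 1;
#else
    return 0;
#endif
}
```

the macro expands to its own name (the preprocessor doesn't re-expand a macro inside its own expansion), so uses of `EXAMPLE_FLAG_B` still resolve to the enumerator.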
> There is of course no reason such a generator couldn't be used for
> kernel-only headers at some point, but I am concentrating on the
>
> Another major motivation is to be able to include one named struct
> anonymously inside another, without having to repeat the definition.
> (This is not supported in standard C or GNU C; MS C supports it as an
> extension, and I have requested that it be added into GNU C which would
> also allow it to be used with __extension__, and perhaps get folded into
> a future C standard since it would now fit the criterion of more than
> one implementation; however, the runway for being able to use that in
> UAPI headers is quite long.)
>
> I obviously want to keep a C-like syntax for this, which is a major
> reason for using a parser like yacc/bison.
>
> I have done such a project in the past, with some good success. That
> being said, the requirements for the Linux UAPI language are obviously
> much more complex. A few things I have considered are wanting to be able
> to namespace constants or, more or less equivalently, create
> enumerations in bits and pieces (consider ioctl constants, for example)
> and have them coalesce into a single definition if appropriate for the
> target language.
>
> Speaking of ioctl constants: one of the current problems is that a fair
> number of ioctl constants do not have the size/type annotations, and
> perhaps worse, it is impossible to tell from just the numeric value
> (since _IOC_NONE expands to 0, an _IO() ioctl ends up having no type
> information at all.) This is something that *definitely* ought to be
> added, even if a certain backend cannot preserve that information
>
> Thoughts?
>
> -hpa
>
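on the ioctl point, a small sketch of what the number encoding does and doesn't preserve. it assumes the asm-generic layout, where `_IOC_NONE` is 0 (x86 and arm; a few architectures differ). `EXAMPLE_RESET`, `EXAMPLE_GET_INT` and the two helper functions are hypothetical, and the `'x'` magic is arbitrary:

```c
#include <linux/ioctl.h>

/* hypothetical ioctl numbers for illustration only */
#define EXAMPLE_RESET   _IO('x', 1)        /* no direction, no size */
#define EXAMPLE_GET_INT _IOR('x', 2, int)  /* kernel writes an int back */

/* extract the direction bits encoded in an ioctl number */
static unsigned int example_dir(unsigned int cmd)
{
    return _IOC_DIR(cmd);
}

/* extract the argument size encoded in an ioctl number */
static unsigned int example_size(unsigned int cmd)
{
    return _IOC_SIZE(cmd);
}
```

`EXAMPLE_GET_INT` decodes back to `_IOC_READ` and `sizeof(int)`, but `EXAMPLE_RESET` decodes to direction `_IOC_NONE` and size 0, so from the number alone you can't tell "takes no argument" apart from "takes one that nobody annotated".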