Message-ID: <feb98a0f-8d17-495c-b556-b4fe19446d5d@zytor.com>
Date: Thu, 15 May 2025 13:04:52 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Arnd Bergmann <arnd@...db.de>, LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        libc-alpha@...rceware.org, linux-arch@...r.kernel.org
Subject: Metalanguage for the Linux UAPI

OK, so this is something I have been thinking about for quite a while. 
It would be a quite large project, so I would like to hear people's 
opinions on it before even starting.

We have finally succeeded in divorcing the Linux UAPI from the general 
kernel headers, but even so, there are a lot of things in the UAPI that 
mean it is not possible for an arbitrary libc to use it directly. For 
example, "struct termios" is not the glibc "struct termios", but 
redefining it breaks the ioctl numbering unless the ioctl headers are 
changed as well, and so on. However, other libcs want to use the struct 
termios as defined in the kernel, or, more likely, struct termios2.

Furthermore, I was looking further into how C++ templates could be used 
to make user pointers inherently safe and probably more efficient, but 
ran into the problem that you really want to be able to convert a 
user-tagged structure to a structure with "safe-user-tagged" members 
(after access_ok), which turned out not to be trivially supportable even 
after the latest C++ modernizations (without which I don't consider C++ 
viable at all; I would not consider versions of C++ before C++17 worthy 
of even looking at; C++20 preferred.)

And it is not just generation of in-kernel versus out-of-kernel headers 
that is an issue (which we have managed to deal with pretty well.) There 
generally isn't enough information in C headers alone to do well at 
creating bindings for other languages, *especially* given how many 
constants are defined in terms of macros.

The use of C also makes it hard to mangle the headers for user space. 
For example, glibc has to add __extension__ before anonymous struct or 
union members in order to be able to compile in strict C90 mode.

I have been considering whether it would make sense to create more of a 
metalanguage for the Linux UAPI. This would be run through a preprocessor 
more advanced than cpp, written in C and yacc/bison. (It could 
also be done via a gcc plugin or a DWARF parser, but I do not like tying 
this to compiler internals, and DWARF parsing is probably more complex 
and less versatile.)

It could thus provide things like "true" constants (constexpr for C++11 
or C23, or enums), bitfield macro explosions and so on, depending on 
what the backend user would like: namespacing, distributed enumerations, 
and assembly offset constants, and even possibly syscall stubs.

There is of course no reason such a generator couldn't be used for 
kernel-only headers at some point, but I am concentrating on the UAPI 
for now.

Another major motivation is to be able to include one named struct 
anonymously inside another, without having to repeat the definition. 
(This is not supported in standard C or GNU C; MS C supports it as an 
extension, and I have requested that it be added into GNU C which would 
also allow it to be used with __extension__, and perhaps get folded into 
a future C standard since it would now fit the criterion of more than 
one implementation; however, the runway for being able to use that in 
UAPI headers is quite long.)

I obviously want to keep a C-like syntax for this, which is a major 
reason for using a parser like yacc/bison.

I have done such a project in the past, with some good success. That 
being said, the requirements for the Linux UAPI language are obviously 
much more complex. A few things I have considered are wanting to be able 
to namespace constants or, more or less equivalently, create 
enumerations in bits and pieces (consider ioctl constants, for example) 
and have them coalesce into a single definition if appropriate for the 
target language.

Speaking of ioctl constants: one of the current problems is that a fair 
number of ioctl constants do not have the size/type annotations, and 
perhaps worse, it is impossible to tell which ones from just the numeric 
value: since _IOC_NONE expands to 0, an _IO() ioctl ends up having no 
type information at all. This is something that *definitely* ought to be 
added, even if a certain backend cannot preserve that information.

Thoughts?

	-hpa

