linux-kernel - Re: Explicitly defining the userspace API

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87levefvgz.fsf@catern.com>
Date:   Fri, 06 May 2022 16:59:26 +0000 (UTC)
From:   Spencer Baugh <sbaugh@...ern.com>
To:     Greg KH <gregkh@...uxfoundation.org>
Cc:     linux-api@...r.kernel.org, linux-kernel@...r.kernel.org,
        marcin@...zkiewicz.com.pl, torvalds@...ux-foundation.org,
        arnd@...db.de
Subject: Re: Explicitly defining the userspace API

Greg KH <gregkh@...uxfoundation.org> writes:
> On Wed, Apr 20, 2022 at 04:15:25PM +0000, Spencer Baugh wrote:
>> 
>> Linux guarantees the stability of its userspace API, but the API
>> itself is only informally described, primarily with English prose.  I
>> want to add an explicit, authoritative machine-readable definition of
>> the Linux userspace API.
>> 
>> As background, in a conventional libc like glibc, read(2) calls the
>> Linux system call read, passing arguments in an architecture-specific
>> way according to the specific details of read.
>> 
>> The details of these syscalls are at best documented in manpages, and
>> often defined only by the implementation.  Anyone else who wants to
>> work with a syscall, in any way, needs to duplicate all those details.
>> 
>> So the most basic definition of the API would just represent the
>> information already present in SYSCALL_DEFINE macros: the C types of
>> arguments and return values.  More usefully, it would describe the
>> formats of those arguments and return values: that the first argument
>> to read is a file descriptor rather than an arbitrary integer, and
>> what flags are valid in the flags argument of openat, and that open
>> returns a file descriptor.  A step beyond that would be describing, in
>> some limited way, the effects of syscalls; for example, that read
>> writes into the passed buffer the number of bytes that it returned.
>
> So how would you define read() in this format in a way that has not
> already been attempted in the past?

I don't know about any attempts at doing this in the past (other than
what's already been mentioned in this thread - e.g. SYSCALL_DEFINE),
what do you have in mind?

> How are you going to define a format that explains functionality in a
> way that is not just the implementation in the end?

Lots of information can be expressed just with more specific types on
the function signature, even with regular C types.  No need to expose
the implementation in any way.

For example, accept4's signature is:

SYSCALL_DEFINE4(accept4, int, fd, struct sockaddr __user *, upeer_sockaddr,
		int __user *, upeer_addrlen, int, flags)

Here, fd and flags are the same type and have nothing to distinguish
them.  But, purely as an example, not suggesting exactly this, but one
could have:

typedef int user_fd_t;
typedef int accept_flags_t;

SYSCALL_DEFINE4(accept4, user_fd_t, fd, struct sockaddr __user *, upeer_sockaddr,
		int __user *, upeer_addrlen, accept_flags_t, flags)

Then a user could parse this SYSCALL_DEFINE and know that fd and flags
have different types with different possible valid values. user_fd_t
would be used by many different syscalls, accept_flags_t just by this.

With just this, the user of this information would still need to know
what user_fd and accept_flags are.  The next step would be describing
the valid values for accept_flags.  Unfortunately that's not something
that the C type system alone can express, but again purely as an
example, but one could have something like:

FLAGS_DEFINE(accept_flags, int,
  SOCK_CLOEXEC,
  SOCK_NONBLOCK)

Then a user could parse this FLAGS_DEFINE and know what the range of
valid values for accept_flags_t is.  This could also be used in the
kernel; for example, FLAGS_DEFINE could generate an accept_flags_valid
function, usable in accept4 as:

if (!accept_flags_valid(flags))
	return -EINVAL;

As for describing the buffer-writing behavior of read like I mentioned
before, here's a sketch of what that maybe could look like.  The current
signature of read is:

SYSCALL_DEFINE3(read, unsigned int, fd, char __user *, buf, size_t, count)

One could imagine adding a type to the return value and changing this to
something like:

#define bytes_written_or_error(written_buffer) int
#define writable_user_buf(size_of_buffer) char __user *

SYSCALL_DEFINE3_RET(bytes_written_or_error(buf),
                    read, unsigned int, fd,
                    writable_user_buf(count), buf, size_t, count)

A user could parse this and know at least partially how read uses the
passed-in buffer, without having to look at the implementation.

Just for the sake of mentioning it, one could also imagine static
analysis which checks the kernel implementation against these
more-detailed types, which could catch bugs.  But I'm not necessarily
proposing doing that - this is useful on its own even if it's not
checked by static analysis.

>> One step in this direction is Documentation/ABI, which specifies the
>> stability guarantees for different userspace APIs in a semi-formal
>> way.  But it doesn't specify the actual content of those APIs, and it
>> doesn't cover individual syscalls at all.
>
> The content is described in Documentation/ABI/ entries, where do you see
> that missing?

I meant that it doesn't describe the content of the APIs in a
machine-readable way.  (It's still very useful of course!)

> And you are correct, that place does not describe syscalls, or other
> user/kernel interfaces that predate sysfs.
>
> good luck!

Thank you!