linux-kernel - Re: [PATCH 1/6] tools/nolibc: add support for waitid()

Message-ID: <20241223081319.GA3840@1wt.eu>
Date: Mon, 23 Dec 2024 09:13:19 +0100
From: Willy Tarreau <w@....eu>
To: Thomas Weißschuh <linux@...ssschuh.net>
Cc: Shuah Khan <shuah@...nel.org>, Paul Walmsley <paul.walmsley@...ive.com>,
        Palmer Dabbelt <palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>,
        linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
        linux-riscv@...ts.infradead.org, Zhangjin Wu <falcon@...ylab.org>
Subject: Re: [PATCH 1/6] tools/nolibc: add support for waitid()

Hi Thomas!

On Sun, Dec 22, 2024 at 12:39:01PM +0100, Thomas Weißschuh wrote:
> > Maybe it will be time for us to run an overall audit of arch-dependent
> > syscalls we currently have, to make sure that the common ones continue
> > to work fine there (and waitpid() definitely is as common a syscall as
> > open() since it's the good old and portable one).
> 
> Isn't this what nolibc-test is already doing?

My concern is that it might progressively drift away from this if we
replace some standard syscalls with newer cross-arch ones.

> Or do you also want to compare it to non-current kernel versions?

I mean that we progressively replace old POSIX calls with new cross-arch
ones in the system (e.g. open->openat, waitpid->waitid etc). Normally
it's a libc's role to preserve application-level compatibility by
maintaining the mapping between the standard calls and the specific
ones, so that applications relying on the standard ones continue to
work. That was one of the original goals of nolibc.
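
To illustrate the kind of mapping I mean, here is a minimal sketch of a
waitpid() built only on top of waitid(). It is written against a regular
libc so that it can be tried standalone, and it is not meant to be the
code from the patch. The names waitpid_via_waitid() and
reap_child_exit_code() are made up for the example, and the status
encoding assumes the usual Linux layout of the wait status word.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: emulate waitpid() using only waitid(). */
pid_t waitpid_via_waitid(pid_t pid, int *status, int options)
{
	siginfo_t info;
	idtype_t idtype;
	id_t id;

	/* Translate waitpid()'s pid conventions into (idtype, id). */
	if (pid < -1) {
		idtype = P_PGID;
		id = (id_t)-pid;
	} else if (pid == -1) {
		idtype = P_ALL;
		id = 0;
	} else if (pid == 0) {
		idtype = P_PGID;
		id = (id_t)getpgrp();
	} else {
		idtype = P_PID;
		id = (id_t)pid;
	}

	/* With WNOHANG and nothing to report, si_pid is left at 0. */
	info.si_pid = 0;
	if (waitid(idtype, id, &info, options | WEXITED) < 0)
		return -1;
	if (info.si_pid == 0)
		return 0;

	if (status) {
		/* Assumption: Linux encoding of the wait status word. */
		switch (info.si_code) {
		case CLD_EXITED:
			*status = (info.si_status & 0xff) << 8;
			break;
		case CLD_KILLED:
			*status = info.si_status & 0x7f;
			break;
		case CLD_DUMPED:
			*status = (info.si_status & 0x7f) | 0x80;
			break;
		case CLD_STOPPED:
		case CLD_TRAPPED:
			*status = ((info.si_status & 0xff) << 8) | 0x7f;
			break;
		case CLD_CONTINUED:
			*status = 0xffff;
			break;
		default:
			*status = 0;
			break;
		}
	}
	return info.si_pid;
}

/* Demo helper: fork a child exiting with 'code', reap it through the
 * emulation and return the decoded exit code, or -1 on failure.
 */
int reap_child_exit_code(int code)
{
	int status = 0;
	pid_t pid;

	assert(code >= 0 && code <= 255);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);
	if (waitpid_via_waitid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

An application that only knows waitpid() keeps working as long as such a
wrapper exists, regardless of which of the two calls the arch provides.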

I have nothing against missing some calls in newly added architectures,
of course. But when I see, for example, that we switch some of the
lower layer tests to use a pipe because some call was not present, I
tend to think that maybe we should first define the minimal set of
working syscalls that the nolibc-test program requires to be usable on
any arch.

In the current case, we seem to have to arbitrate between pipe() and
lseek() support for basic nolibc-test support. But maybe a new arch will
be added for which it will be the opposite choice between the two. We
may very well require both of them to work if needed, or either, at the
risk of delaying support of a specific arch in the future, but that's
fine.

Second, we should take a fresh look at all our supported calls and check
whether some of them are present while the legacy calls they're supposed
to replace are missing (which would be perfectly possible). For example,
if we had implemented waitpid() much later, it would have been perfectly
possible that we'd only implement waitid() and miss the waitpid() that
applications expect.
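
Part of such a check can be done at build time, since the kernel headers
only define __NR_xxx for the syscalls an arch really has. A rough sketch,
assuming a regular libc's <sys/syscall.h> exposes the __NR_* macros (the
function names are made up for the example and are not existing nolibc
code):

```c
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>

/* Report which syscall could back waitpid() on the build arch. */
const char *waitpid_backend(void)
{
#if defined(__NR_waitpid)
	return "waitpid";
#elif defined(__NR_wait4)
	return "wait4";
#elif defined(__NR_waitid)
	return "waitid";
#else
	return "none";
#endif
}

/* Report which syscall could back open() on the build arch. */
const char *open_backend(void)
{
#if defined(__NR_open)
	return "open";
#elif defined(__NR_openat)
	return "openat";
#else
	return "none";
#endif
}

/* Abort if a legacy call that applications expect has no backend. */
void check_legacy_backends(void)
{
	assert(strcmp(waitpid_backend(), "none") != 0);
	assert(strcmp(open_backend(), "none") != 0);
}
```

This only tells which syscall numbers exist, not whether the wrapper was
written, so it complements the run-time tests rather than replacing them.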

Honestly it's not a particularly interesting job to do. That's why I'm
mostly saying that we should just keep that in mind to be careful with
new additions.

> In general the special rv32 syscalls are not really
> architecture-dependent, they just dropped the "legacy" ones, especially
> all using 32bit timestamps.

I understand, and when adding a new arch we need to start with something.
I just think that we should consider that for a new arch to switch from
"in progress" to "working", it would require the legacy ones working on
other archs to work on that one as well. My concern is that early boot
tools would only build on certain archs but not all when all of them are
supposed to be in a working state. When it fails everywhere that's fine,
it just means we're missing some calls and the user is welcome to submit
a patch. But when the user only tests on, say, x86 and arm, and someone
relies on that to package kernels and discovers late that it fails on
riscv for example, that's a problem. Note that I'm just making up examples,
and not designating any particular issue.

Maybe it would be convenient to maintain a support matrix for the syscalls
we currently support. It could look something like:

   waitpid()   x86: native
               arm: native
               riscv32: via waitid()
               foobar: not yet

   open()      ...

etc. I could try to work on such a thing if you're interested as well, but
not now as I don't have the time at the moment.
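
If we wanted the matrix to be checkable by a program instead of being a
pure text file, it could be kept as a small table. This is only a sketch:
the struct, the enum and the function names are invented for the example,
and the entries are just the made-up ones from the matrix above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum support {
	SUPPORT_NATIVE,   /* the arch has the syscall itself */
	SUPPORT_VIA,      /* emulated on top of another syscall */
	SUPPORT_NOT_YET,  /* not implemented for this arch */
};

struct sys_support {
	const char *call;
	const char *arch;
	enum support how;
	const char *via;  /* only meaningful for SUPPORT_VIA */
};

/* Example entries only, copied from the made-up matrix. */
static const struct sys_support support_matrix[] = {
	{ "waitpid", "x86",     SUPPORT_NATIVE,  NULL     },
	{ "waitpid", "arm",     SUPPORT_NATIVE,  NULL     },
	{ "waitpid", "riscv32", SUPPORT_VIA,     "waitid" },
	{ "waitpid", "foobar",  SUPPORT_NOT_YET, NULL     },
};

#define MATRIX_SIZE (sizeof(support_matrix) / sizeof(support_matrix[0]))

/* Return the entry for (call, arch), or NULL if it is not listed. */
const struct sys_support *support_lookup(const char *call, const char *arch)
{
	size_t i;

	for (i = 0; i < MATRIX_SIZE; i++) {
		if (!strcmp(support_matrix[i].call, call) &&
		    !strcmp(support_matrix[i].arch, arch))
			return &support_matrix[i];
	}
	return NULL;
}

/* Consistency check: every "via" entry must name its backing call. */
void support_selfcheck(void)
{
	size_t i;

	for (i = 0; i < MATRIX_SIZE; i++) {
		if (support_matrix[i].how == SUPPORT_VIA)
			assert(support_matrix[i].via != NULL);
	}
}
```

A missing (call, arch) pair returns NULL, which makes it easy to spot
combinations that nobody has looked at yet.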

Cheers,
Willy