Message-Id: <73010511-a804-4cf4-a5c1-1d08e3f324c5@app.fastmail.com>
Date: Fri, 07 Nov 2025 14:33:12 +0100
From: "Arnd Bergmann" <arnd@...db.de>
To: "Yuan Tan" <tanyuan@...ylab.org>,
 "Masahiro Yamada" <masahiroy@...nel.org>,
 "Nathan Chancellor" <nathan@...nel.org>,
 "Palmer Dabbelt" <palmer@...belt.com>, linux-kbuild@...r.kernel.org,
 linux-riscv@...ts.infradead.org
Cc: Linux-Arch <linux-arch@...r.kernel.org>, linux-kernel@...r.kernel.org,
 i@...kray.me, "Zhangjin Wu" <falcon@...ylab.org>, ronbogo@...look.com,
 z1652074432@...il.com, lx24@....ynu.edu.cn
Subject: Re: [PATCH v2 0/8] dce, riscv: Unused syscall trimming with PUSHSECTION and
 conditional KEEP()

On Tue, Nov 4, 2025, at 03:21, Yuan Tan wrote:

>> Sorry for the late reply — this patchset really wore me out, and I only just
>> recovered.  Thank you very much for your feedback!

Sorry to hear this has been stressful for you. It's an unfortunate
aspect of the way we work that sometimes 

> On 10/15/2025 12:47 AM, Arnd Bergmann wrote:
>> On Wed, Oct 15, 2025, at 08:16, Yuan Tan wrote:
>> Thanks a lot for your work on this. I think it is indeed valuable to
>> be able to optimize kernels with a smaller subset of system calls for
>> known workloads, and have as much dead code elimination as possible.
>>
>> However, I continue to think that the added scripting with a known
>> set of syscall names is fundamentally the wrong approach to get to
>> this list: This adds complexity to the build process in one of
>> the areas that is already too complicated, and it duplicates what
>> we can already do with Kconfig for a subset of the system calls.
>>
>> I think the way we should configure the set of syscalls instead is
>> to add more Kconfig symbols guarded by CONFIG_EXPERT that turn
>> classes of syscalls on or off. You have obviously done the research
>> to come up with a list of used/unused entry points for one or more
>> workloads. Can you share those lists?
>
> Regarding your suggestion to use Kconfig to control which system calls are
> included or excluded, perhaps we could take inspiration from systemd's
> classification approach. For example, systemd groups syscalls into categories
> like[1]:
>
> @aio @basic-io @chown @clock @cpu-emulation @debug @file-system
>
> and so on.

I think many of the categories already align with the structure of
the kernel source code, so maintaining them falls out of the build
system.

More importantly, turning off parts of the kernel on a per-file
basis tends to work better for eliminating the entire block
of code, because removing only the syscall entry still leaves
references to functions and global data structures from initcalls
and exported functions.

> However, if we go down this route, we would need to continuously maintain and
> update these categories whenever Linux introduces new system calls. I'm not
> sure whether that would be an ideal long-term approach.

If we can roughly align the categories between the kernel and the
systemd classification, that would at least make it easier to maintain
the systemd ones.

> For reference, here is the list of syscalls required to run Lighttpd.
>
> execve set_tid_address mount write brk mmap munmap getuid getgid getpid
> clock_gettime getcwd fcntl fstat read dup3 socket setsockopt bind listen
> rt_sigaction rt_sigprocmask newfstatat prlimit64 epoll_create1 epoll_ctl pipe2
> epoll_pwait accept4 getsockopt recvfrom shutdown writev getdents64 openat close
>
> We've tested it successfully on QEMU + initramfs, and I can share the
> deployment script if anyone would like to reproduce the setup.

Thanks for the list! Is this a workload you are interested in actually
optimizing for deployment, or just something you used as a simple test
environment?

I see three types of syscalls in your list above:

1. essential ones that are basically always needed
2. socket interfaces (already optional)
3. epoll (already optional)

The first two sets will clearly need additional syscalls that are
usually used in combination with the ones listed:
If we provide read, write and writev, we should also provide readv,
and if we provide socket/bind/listen/recvfrom, we also likely want
accept/connect/sendto and probably recvmsg/sendmsg.

Starting with your set of syscalls and those closely related
ones, as well as the set of syscalls that already have a
Kconfig option, we should be able to find the set of syscalls
that are unconditionally enabled but could be optional.
If you have the chance, could you compile that list?
I might also have a list, but probably not in the next week.
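
A rough sketch of that bookkeeping, in case it helps: take the Lighttpd
list plus the closely related calls named above, drop everything that
already sits behind a Kconfig symbol, and look at what remains. The
mapping below is hand-written for illustration and only covers the two
groups mentioned as already optional (CONFIG_NET and CONFIG_EPOLL); the
helper name split_by_kconfig is hypothetical, and a real list would have
to be checked against the kernel tree.

```python
# Syscalls reported as required by Lighttpd (copied from the list above).
LIGHTTPD = """
execve set_tid_address mount write brk mmap munmap getuid getgid getpid
clock_gettime getcwd fcntl fstat read dup3 socket setsockopt bind listen
rt_sigaction rt_sigprocmask newfstatat prlimit64 epoll_create1 epoll_ctl pipe2
epoll_pwait accept4 getsockopt recvfrom shutdown writev getdents64 openat close
""".split()

# Closely related calls suggested in this mail.
RELATED = "readv accept connect sendto recvmsg sendmsg".split()

# ASSUMPTION: hand-written, incomplete mapping for illustration only.
# It covers just the socket and epoll groups; other existing options
# (sysvipc, futex, ...) would need to be added the same way.
ALREADY_OPTIONAL = {
    "CONFIG_NET": {
        "socket", "setsockopt", "getsockopt", "bind", "listen", "accept",
        "accept4", "connect", "recvfrom", "sendto", "recvmsg", "sendmsg",
        "shutdown",
    },
    "CONFIG_EPOLL": {"epoll_create1", "epoll_ctl", "epoll_pwait"},
}


def split_by_kconfig(syscalls, optional):
    """Return (guarded, unconditional).

    guarded maps a Kconfig symbol to the syscalls it covers;
    unconditional lists the syscalls no symbol in `optional` covers.
    """
    guarded = {}
    unconditional = []
    for name in syscalls:
        for symbol, members in optional.items():
            if name in members:
                guarded.setdefault(symbol, []).append(name)
                break
        else:
            # No known Kconfig symbol: candidate for a new option.
            unconditional.append(name)
    return guarded, unconditional


guarded, unconditional = split_by_kconfig(
    sorted(set(LIGHTTPD + RELATED)), ALREADY_OPTIONAL
)

if __name__ == "__main__":
    for symbol, names in sorted(guarded.items()):
        print(symbol, " ".join(names))
    print("no Kconfig option in this mapping:", " ".join(unconditional))
```

The "unconditional" output is only a starting point: some of those calls
are essential and should stay that way, so the list still needs a manual
pass to decide which ones could reasonably become optional.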

The next step after that, I think, is to measure the impact
of turning off those remaining ones in a configuration that
already has the existing symbols (e.g. sysvipc, futex,
compat_32bit_time, ...) disabled.

Side note: I'm a bit surprised to see fstat() in the list, since riscv
should only really support newfstat().

> Also, I noticed that there haven't been any comments so far on the later
> patches introducing the PUSHSECTION macro.  I'm a bit concerned about how
> people perceive this part.

I don't have a strong opinion on this part.

     Arnd
