Message-ID: <20250513001751.71660-1-kuniyu@amazon.com>
Date: Mon, 12 May 2025 17:17:36 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <bluca@...ian.org>
CC: <alexander@...alicyn.com>, <brauner@...nel.org>,
<daan.j.demeyer@...il.com>, <daniel@...earbox.net>, <davem@...emloft.net>,
<david@...dahead.eu>, <edumazet@...gle.com>, <horms@...nel.org>,
<jack@...e.cz>, <jannh@...gle.com>, <kuba@...nel.org>, <kuniyu@...zon.com>,
<lennart@...ttering.net>, <linux-fsdevel@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linux-security-module@...r.kernel.org>,
<me@...dnzj.com>, <netdev@...r.kernel.org>, <oleg@...hat.com>,
<pabeni@...hat.com>, <viro@...iv.linux.org.uk>, <zbyszek@...waw.pl>
Subject: Re: [PATCH v6 4/9] coredump: add coredump socket
From: Luca Boccassi <bluca@...ian.org>
Date: Mon, 12 May 2025 11:58:54 +0100
> On Mon, 12 May 2025 at 09:56, Christian Brauner <brauner@...nel.org> wrote:
> >
> > Coredumping currently supports two modes:
> >
> > (1) Dumping directly into a file somewhere on the filesystem.
> > (2) Dumping into a pipe connected to a usermode helper process
> > spawned as a child of the system_unbound_wq or kthreadd.
> >
> > For simplicity I'm mostly ignoring (1). There's probably still some
> > users of (1) out there but processing coredumps in this way can be
> > considered adventurous especially in the face of set*id binaries.
> >
> > The most common option should be (2) by now. It works by allowing
> > userspace to put a string into /proc/sys/kernel/core_pattern like:
> >
> > |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
> >
> > The "|" at the beginning indicates to the kernel that a pipe must be
> > used. The path following the pipe indicator is a path to a binary that
> > will be spawned as a usermode helper process. Any additional parameters
> > pass information about the task that is generating the coredump to the
> > binary that processes the coredump.
> >
> > In the example core_pattern shown above systemd-coredump is spawned as a
> > usermode helper. There's various conceptual consequences of this
> > (non-exhaustive list):
> >
> > - systemd-coredump is spawned with file descriptor number 0 (stdin)
> > connected to the read-end of the pipe. All other file descriptors are
> > closed. That specifically includes 1 (stdout) and 2 (stderr). This has
> > already caused bugs because userspace assumed that this cannot happen
> > (Whether or not this is a sane assumption is irrelevant.).
> >
> > - systemd-coredump will be spawned as a child of system_unbound_wq. So
> > it is not a child of any userspace process and specifically not a
> > child of PID 1. It cannot be waited upon and is a weird hybrid
> > upcall, which is difficult for userspace to control correctly.
> >
> > - systemd-coredump is spawned with full kernel privileges. This
> > necessitates all kinds of weird privilege dropping exercises in
> > userspace to make this safe.
> >
> > - A new usermode helper has to be spawned for each crashing process.
> >
> > This series adds a new mode:
> >
> > (3) Dumping into an abstract AF_UNIX socket.
> >
> > Userspace can set /proc/sys/kernel/core_pattern to:
> >
> > @address SO_COOKIE
> >
> > The "@" at the beginning indicates to the kernel that the abstract
> > AF_UNIX coredump socket will be used to process coredumps. The address
> > is given by @address and must be followed by the socket cookie of the
> > coredump listening socket.
> >
> > The socket cookie is used to verify the socket connection. If the
> > coredump server restarts or crashes and someone recycles the socket
> > address the kernel will detect that the address has been recycled as the
> > socket cookie will have necessarily changed and refuse to connect.
>
> This dynamic/cookie prefix makes it impossible to use this with socket
> activation units. The way systemd-coredump works is that every
> instance is an independent templated unit, spawned when there's a
> connection to the private socket. If the path was fixed, we could just
> reuse the same mechanism, it would fit very nicely with minimal
> changes.
Note that this version does not use a prefix. It only requires users
to pass the socket cookie via core_pattern so that the kernel can
verify the peer.
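
To make that concrete, here is a rough userspace sketch of how the
core_pattern value could be built from a listening socket. It is only an
illustration, not code from the series: the helper names, the example
address, the SOCK_STREAM socket type, and the decimal rendering of the
cookie are all assumptions. SO_COOKIE is hardcoded to 57 (the
asm-generic value) as a fallback because Python may not export the
constant.

```python
import socket
import struct

# Assumption: 57 is the asm-generic value of SO_COOKIE; a few
# architectures use a different number.
SO_COOKIE = getattr(socket, "SO_COOKIE", 57)


def read_socket_cookie(sock):
    """Return the 64-bit cookie the kernel assigned to this socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, SO_COOKIE, 8)
    return struct.unpack("=Q", raw)[0]


def open_coredump_listener(address):
    """Bind and listen on an abstract AF_UNIX address (Linux only).

    The socket type is an assumption made for this sketch.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind("\0" + address)  # leading NUL selects the abstract namespace
    sock.listen(socket.SOMAXCONN)
    return sock


def format_core_pattern(address, cookie):
    """Build the "@address SO_COOKIE" string described above.

    Rendering the cookie in decimal is an assumption of this sketch.
    """
    return "@%s %d" % (address, cookie)
```

A privileged caller would then write
format_core_pattern(address, read_socket_cookie(listener)) to
/proc/sys/kernel/core_pattern while keeping the listener open.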
>
> But because you need a "server" to be permanently running, this means
> socket-based activation can no longer work, and systemd-coredump must
> switch to a persistently-running mode.
The only thing systemd has to do is pass the cookie via core_pattern
after creating the socket. As long as systemd holds the file descriptor
of the socket, you don't need a dedicated "server" running permanently,
and the fd can be passed around to a spawned/activated process.
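
The reason this works is that the cookie belongs to the socket itself
and not to a particular file descriptor. The sketch below illustrates
that with a plain dup(): the original descriptor is closed and the
duplicate still reports the same cookie. The function name and the
address passed in are hypothetical, SO_COOKIE again falls back to the
asm-generic value 57, and dup() merely stands in for whatever fd-passing
mechanism the service manager actually uses.

```python
import os
import socket
import struct

# Assumption: 57 is the asm-generic value of SO_COOKIE.
SO_COOKIE = getattr(socket, "SO_COOKIE", 57)


def read_socket_cookie(sock):
    """Return the 64-bit cookie the kernel assigned to this socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, SO_COOKIE, 8)
    return struct.unpack("=Q", raw)[0]


def cookie_before_and_after_handover(address):
    """Create a listener, hand it over via dup(), compare the cookies.

    Returns (cookie_before, cookie_after). Linux only, because it uses
    an abstract AF_UNIX address.
    """
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind("\0" + address)
    listener.listen(1)
    before = read_socket_cookie(listener)

    # Duplicate the descriptor, then drop the original one. The socket
    # stays alive because the duplicate still refers to it.
    handed_over = socket.socket(fileno=os.dup(listener.fileno()))
    listener.close()

    after = read_socket_cookie(handed_over)
    handed_over.close()
    return before, after
```

Both values should be identical, so a core_pattern written once stays
valid for as long as some process keeps the socket open.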
> This is a severe degradation of
> functionality, will continuously waste CPU/memory resources for no
> good reasons, and makes the whole thing more fragile and complex, as
> if there are any issues with this server, you start losing core files.
> And honestly I don't really see the point? Setting the pattern is a
> privileged operation anyway. systemd manages the socket with a socket
> unit and again that's privileged already.
>
> Could we drop this cookie prefix and go back to the previous version
> (v5), please? Or if there is some specific non-systemd use case in
> mind that I am not aware of, have both options, so that we can use the
> simpler and more straightforward one with systemd-coredump.