Message-ID: <CAG48ez0r4A7iMXzBBdPiHWycYSAGSm7VFWULCqKQPXoBKFWpEw@mail.gmail.com>
Date: Wed, 21 May 2025 02:54:58 +0200
From: Jann Horn <jannh@...gle.com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: stephen@...workplumber.org, alexander@...alicyn.com, brauner@...nel.org,
daan.j.demeyer@...il.com, daniel@...earbox.net, davem@...emloft.net,
david@...dahead.eu, edumazet@...gle.com, horms@...nel.org, jack@...e.cz,
kuba@...nel.org, lennart@...ttering.net, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-security-module@...r.kernel.org,
luca.boccassi@...il.com, me@...dnzj.com, netdev@...r.kernel.org,
oleg@...hat.com, pabeni@...hat.com, serge@...lyn.com, viro@...iv.linux.org.uk,
zbyszek@...waw.pl
Subject: Re: [PATCH v8 0/9] coredump: add coredump socket

On Wed, May 21, 2025 at 2:42 AM Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
> From: Stephen Hemminger <stephen@...workplumber.org>
> Date: Tue, 20 May 2025 12:28:38 -0700
> > On Fri, 16 May 2025 13:25:27 +0200
> > Christian Brauner <brauner@...nel.org> wrote:
> >
> > > Coredumping currently supports two modes:
> > >
> > > (1) Dumping directly into a file somewhere on the filesystem.
> > > (2) Dumping into a pipe connected to a usermode helper process
> > > spawned as a child of the system_unbound_wq or kthreadd.
> > >
> > > For simplicity I'm mostly ignoring (1). There are probably still some
> > > users of (1) out there but processing coredumps in this way can be
> > > considered adventurous especially in the face of set*id binaries.
> > >
> > > The most common option should be (2) by now. It works by allowing
> > > userspace to put a string into /proc/sys/kernel/core_pattern like:
> > >
> > > |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
> > >
> > > The "|" at the beginning indicates to the kernel that a pipe must be
> > > used. The path following the pipe indicator is a path to a binary that
> > > will be spawned as a usermode helper process. Any additional parameters
> > > pass information about the task that is generating the coredump to the
> > > binary that processes the coredump.
> > >
> > > In the example core_pattern shown above, systemd-coredump is spawned as a
> > > usermode helper. There are various conceptual consequences of this
> > > (non-exhaustive list):
> > >
> > > - systemd-coredump is spawned with file descriptor number 0 (stdin)
> > > connected to the read-end of the pipe. All other file descriptors are
> > > closed. That specifically includes 1 (stdout) and 2 (stderr). This has
> > > already caused bugs because userspace assumed that this cannot happen
> > > (Whether or not this is a sane assumption is irrelevant.).
> > >
> > > - systemd-coredump will be spawned as a child of system_unbound_wq. So
> > > it is not a child of any userspace process and specifically not a
> > > child of PID 1. It cannot be waited upon and is a weird hybrid
> > > upcall, which is difficult for userspace to control correctly.
> > >
> > > - systemd-coredump is spawned with full kernel privileges. This
> > > necessitates all kinds of weird privilege dropping exercises in
> > > userspace to make this safe.
> > >
> > > - A new usermode helper has to be spawned for each crashing process.
> > >
> > > This series adds a new mode:
> > >
> > > (3) Dumping into an AF_UNIX socket.
> > >
> > > Userspace can set /proc/sys/kernel/core_pattern to:
> > >
> > > @/path/to/coredump.socket
> > >
> > > The "@" at the beginning indicates to the kernel that an AF_UNIX
> > > coredump socket will be used to process coredumps.
> > >
> > > The coredump socket must be located in the initial mount namespace.
> > > When a task coredumps it opens a client socket in the initial network
> > > namespace and connects to the coredump socket.
> >
> >
> > There is a problem with using @ as naming convention.
> > The starting character of @ is already used to indicate abstract
> > unix domain sockets in some programs like ss.
> > And will the new coredump socket allow use of abstract unix
> > domain sockets?
>
> The coredump only works with the pathname socket, so ideally
> the prefix should be '/', but that is the same as the direct-file
> coredump. We can distinguish the socket by S_ISSOCK() though.

The path lookups work very differently between COREDUMP_SOCK and
COREDUMP_FILE - they are interpreted relative to different namespaces,
and they run with different privileges, and they do different format
string interpretation. I think trying to determine dynamically whether
the path refers to a socket or to a nonexistent location at which we
should create a file (or a preexisting file we should clobber) would
not be practical, partly for these reasons.

Also, fundamentally, if we have the choice between letting userspace
be explicit about what it wants, or trying to guess userspace's intent
from the kernel, I think we should always go for being explicit.
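
As a rough illustration of what "explicit" means here, the following is a
minimal userspace sketch, not the code from the patch series. It selects
the dump mode purely from the first character of the core_pattern string
and never inspects the filesystem. COREDUMP_FILE and COREDUMP_SOCK are the
names used above; COREDUMP_PIPE for mode (2), the function name
classify_core_pattern() and its "rest" out-parameter are assumptions made
up for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical mirror of the three dump modes discussed in this thread.
 * COREDUMP_PIPE is an assumed name for the "|" mode.
 */
enum coredump_type {
	COREDUMP_FILE,
	COREDUMP_PIPE,
	COREDUMP_SOCK,
};

/*
 * Classify a core_pattern string by its prefix character only.
 * No stat() or S_ISSOCK() check is made, so the result does not depend
 * on what currently exists at the path.
 * On return, *rest points at the part of the pattern after the prefix
 * (or at the whole pattern for the file mode).
 */
static enum coredump_type classify_core_pattern(const char *pattern,
						const char **rest)
{
	assert(pattern != NULL);
	assert(rest != NULL);

	switch (pattern[0]) {
	case '|':	/* pipe to a usermode helper */
		*rest = pattern + 1;
		return COREDUMP_PIPE;
	case '@':	/* AF_UNIX coredump socket */
		*rest = pattern + 1;
		return COREDUMP_SOCK;
	default:	/* anything else is treated as a file path */
		*rest = pattern;
		return COREDUMP_FILE;
	}
}
```

With this shape, swapping '@' for another prefix character would only
touch one case label; the decision never depends on a racy lookup of the
path.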

So I guess it could be reasonable to bikeshed the prefix letter and
turn '@' into some other character that is not overloaded with another
meaning in this context, like '>'; but I don't think we should be
changing the overall approach because of this.