Message-Id: <20250430-work-coredump-socket-v1-0-2faf027dbb47@kernel.org>
Date: Wed, 30 Apr 2025 13:05:00 +0200
From: Christian Brauner <brauner@...nel.org>
To: linux-fsdevel@...r.kernel.org
Cc: "David S. Miller" <davem@...emloft.net>, 
 Alexander Viro <viro@...iv.linux.org.uk>, 
 Daan De Meyer <daan.j.demeyer@...il.com>, 
 David Rheinsberg <david@...dahead.eu>, Eric Dumazet <edumazet@...gle.com>, 
 Jakub Kicinski <kuba@...nel.org>, Jan Kara <jack@...e.cz>, 
 Kuniyuki Iwashima <kuniyu@...zon.com>, 
 Lennart Poettering <lennart@...ttering.net>, 
 Luca Boccassi <bluca@...ian.org>, Mike Yuan <me@...dnzj.com>, 
 Oleg Nesterov <oleg@...hat.com>, Paolo Abeni <pabeni@...hat.com>, 
 Simon Horman <horms@...nel.org>, 
 Zbigniew Jędrzejewski-Szmek <zbyszek@...waw.pl>, 
 linux-kernel@...r.kernel.org, netdev@...r.kernel.org, 
 Christian Brauner <brauner@...nel.org>
Subject: [PATCH RFC 0/3] coredump: support AF_UNIX sockets

Coredumping currently supports two modes:

(1) Dumping directly into a file somewhere on the filesystem.
(2) Dumping into a pipe connected to a usermode helper process
    spawned as a child of the system_unbound_wq or kthreadd.

For simplicity I'm mostly ignoring (1). There are probably still some
users of (1) out there, but processing coredumps in this way can be
considered adventurous, especially in the face of set*id binaries.

The most common option should be (2) by now. It works by allowing
userspace to put a string into /proc/sys/kernel/core_pattern like:

        |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h

The "|" at the beginning indicates to the kernel that a pipe must be
used. The path following the pipe indicator is a path to a binary that
will be spawned as a usermode helper process. Any additional parameters
pass information about the task that is generating the coredump to the
binary that processes the coredump.
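
To make the parameter passing concrete, here is a minimal sketch of how
a helper invoked with the pattern above might pick up its arguments.
The struct and function names are hypothetical and this is not how
systemd-coredump does it; the meaning of each specifier follows the
core(5) manual page.

```c
#include <stdio.h>
#include <stdlib.h>

/* Metadata substituted by the kernel for "%P %u %g %s %t %c %h".
 * Hypothetical type, for illustration only. */
struct dump_meta {
        long pid;                      /* %P: PID in the initial PID namespace */
        long uid;                      /* %u: real UID of the dumped process */
        long gid;                      /* %g: real GID of the dumped process */
        long signo;                    /* %s: signal that caused the dump */
        long long timestamp;           /* %t: time of dump, seconds since Epoch */
        unsigned long long core_limit; /* %c: core file size soft limit */
        char hostname[256];            /* %h: hostname */
};

/* Fill *m from the helper's argv. Returns 0 on success, -1 if argv is
 * too short. No further validation is done in this sketch. */
int parse_helper_args(int argc, char *argv[], struct dump_meta *m)
{
        if (argc < 8)
                return -1;

        m->pid = strtol(argv[1], NULL, 10);
        m->uid = strtol(argv[2], NULL, 10);
        m->gid = strtol(argv[3], NULL, 10);
        m->signo = strtol(argv[4], NULL, 10);
        m->timestamp = strtoll(argv[5], NULL, 10);
        m->core_limit = strtoull(argv[6], NULL, 10);
        snprintf(m->hostname, sizeof(m->hostname), "%s", argv[7]);
        return 0;
}
```

A real helper would call this first thing in main() and then read the
coredump itself from file descriptor 0.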

In this case systemd-coredump is spawned as a usermode helper. There
are various conceptual consequences of this (non-exhaustive list):

- systemd-coredump is spawned with file descriptor number 0 (stdin)
  connected to the read-end of the pipe. All other file descriptors are
  closed. That specifically includes 1 (stdout) and 2 (stderr). This has
  already caused bugs because userspace assumed that this cannot happen
  (whether or not this is a sane assumption is irrelevant).

- systemd-coredump will be spawned as a child of system_unbound_wq. So
  it is not a child of any userspace process, and specifically not a
  child of PID 1, so it cannot be waited upon and is in general a weird
  hybrid upcall.

- systemd-coredump is spawned highly privileged, as it is spawned with
  full kernel credentials, requiring all kinds of weird
  privilege-dropping exercises in userspace.
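
As an illustration of the first point, a helper that is started with
only fd 0 open might want to fill the missing standard descriptors
before doing anything else. This is a minimal sketch under that
assumption; the function name is hypothetical and /dev/null is just one
reasonable target.

```c
#include <fcntl.h>
#include <unistd.h>

/* Make sure fds 0, 1 and 2 are all open by pointing any closed one at
 * /dev/null. Returns 0 on success, -1 if /dev/null cannot be opened. */
int ensure_std_fds(void)
{
        int fd;

        /* open() always returns the lowest free descriptor, so opening
         * /dev/null until we get something above 2 fills every gap in
         * the range 0..2. */
        do {
                fd = open("/dev/null", O_RDWR);
                if (fd < 0)
                        return -1;
        } while (fd <= 2);

        /* The last descriptor was only a probe. */
        close(fd);
        return 0;
}
```

With this in place a stray fprintf(stderr, ...) or a later open() can no
longer end up on descriptor 1 or 2 by accident.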

This adds another mode:

(3) Dumping into an AF_UNIX socket.

Userspace can set /proc/sys/kernel/core_pattern to:

        :/run/coredump.socket

The ":" at the beginning indicates to the kernel that an AF_UNIX socket
is used to process coredumps. The task generating the coredump simply
connects to the socket and writes the coredump into the socket.
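
The prefix rule for the three modes can be summarized in a few lines of
userspace C. This is only an illustration of the convention described
here, not the kernel's parsing code; the enum and function names are
hypothetical.

```c
/* The three coredump modes selected by the first character of
 * /proc/sys/kernel/core_pattern. */
enum core_mode {
        CORE_MODE_FILE,   /* (1) plain path: dump into a file */
        CORE_MODE_PIPE,   /* (2) leading '|': pipe to a usermode helper */
        CORE_MODE_SOCKET, /* (3) leading ':': connect to an AF_UNIX socket */
};

/* Classify a core_pattern string by its leading character. */
enum core_mode classify_core_pattern(const char *pattern)
{
        switch (pattern[0]) {
        case '|':
                return CORE_MODE_PIPE;
        case ':':
                return CORE_MODE_SOCKET;
        default:
                return CORE_MODE_FILE;
        }
}
```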

Userspace can get a stable handle on the task generating the coredump by
using the SO_PEERPIDFD socket option. SO_PEERPIDFD uses the thread-group
leader pid stashed during connect(). Even if the task generating the
coredump is a subthread in the thread-group the pidfd of the
thread-group leader is a reliable stable handle. Userspace that's
interested in the credentials of the specific thread that crashed can
use SCM_PIDFD to retrieve them.

The pidfd can be used to safely open and parse /proc/<pid> of the task
and it can also be used to retrieve additional meta information via the
PIDFD_GET_INFO ioctl().
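
On the receiving side this could look roughly as follows. It is a
minimal sketch: the helper names are hypothetical, the fallback value
for SO_PEERPIDFD is the asm-generic one and is an assumption for
architectures with their own socket option numbering, and the PID is
read from the "Pid:" field that pidfds expose in fdinfo rather than via
PIDFD_GET_INFO.

```c
#include <stdio.h>
#include <sys/socket.h>

#ifndef SO_PEERPIDFD
#define SO_PEERPIDFD 77 /* assumption: asm-generic value, older headers lack it */
#endif

/* Return a pidfd for the peer of a connected AF_UNIX socket, or -1 on
 * error (for example on kernels without SO_PEERPIDFD). */
int peer_pidfd(int conn_fd)
{
        int pidfd = -1;
        socklen_t len = sizeof(pidfd);

        if (getsockopt(conn_fd, SOL_SOCKET, SO_PEERPIDFD, &pidfd, &len) < 0)
                return -1;
        return pidfd;
}

/* Look up the PID a pidfd refers to by parsing /proc/self/fdinfo/<fd>.
 * Returns -1 if the fd cannot be inspected or the process is gone. */
long pid_from_pidfd(int pidfd)
{
        char path[64], line[256];
        long pid = -1;
        FILE *f;

        snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", pidfd);
        f = fopen(path, "r");
        if (!f)
                return -1;

        while (fgets(line, sizeof(line), f)) {
                if (sscanf(line, "Pid: %ld", &pid) == 1)
                        break;
        }
        fclose(f);
        return pid;
}
```

A careful consumer would open /proc/<pid> and afterwards check that the
pidfd still refers to a live process, so that a recycled PID is
detected before the data is trusted.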

This allows userspace to process coredumps without relying on usermode
helpers and thus without having to handle super-privileged coredumping
helpers.

This is easy to test:

(a) coredump processing (we're using socat):

    > cat coredump_socket.sh
    #!/bin/bash
    
    set -x
    
    sudo bash -c "echo ':/tmp/stream.sock' > /proc/sys/kernel/core_pattern"
    socat --statistics unix-listen:/tmp/stream.sock,fork FILE:core_file,create,append,truncate
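
For readers who would rather not use socat, the receiving end can be
sketched in C as well. This is a minimal sketch, not part of the
series: both function names are hypothetical, and a SOCK_STREAM socket
is assumed because that is what socat's unix-listen creates.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a listening AF_UNIX stream socket bound to path.
 * Returns the listening fd, or -1 on error. */
int listen_unix(const char *path)
{
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        int fd;

        if (strlen(path) >= sizeof(addr.sun_path)) {
                errno = ENAMETOOLONG;
                return -1;
        }
        strcpy(addr.sun_path, path);

        fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
                return -1;

        unlink(path); /* remove a stale socket from a previous run */
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
            listen(fd, SOMAXCONN) < 0) {
                close(fd);
                return -1;
        }
        return fd;
}

/* Copy everything from in_fd to out_fd until EOF.
 * Returns the number of bytes copied, or -1 on error. */
long long copy_all(int in_fd, int out_fd)
{
        char buf[65536];
        long long total = 0;
        ssize_t n;

        while ((n = read(in_fd, buf, sizeof(buf))) != 0) {
                if (n < 0) {
                        if (errno == EINTR)
                                continue;
                        return -1;
                }
                for (ssize_t off = 0; off < n; ) {
                        ssize_t w = write(out_fd, buf + off, n - off);
                        if (w < 0) {
                                if (errno == EINTR)
                                        continue;
                                return -1;
                        }
                        off += w;
                }
                total += n;
        }
        return total;
}
```

A main() built on these would call listen_unix("/tmp/stream.sock"),
accept() connections in a loop, and pass each connection together with
an open output file to copy_all().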

(b) trigger a coredump:

    user1@...alhost:~/data/scripts$ cat crash.c
    #include <stdio.h>
    #include <unistd.h>
    
    int main(int argc, char *argv[])
    {
            fprintf(stderr, "%u\n", (1 / 0));
            _exit(0);
    }

Signed-off-by: Christian Brauner <brauner@...nel.org>
---
Christian Brauner (3):
      coredump: massage format_corname()
      coredump: massage do_coredump()
      coredump: support AF_UNIX sockets

 fs/coredump.c | 241 ++++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 168 insertions(+), 73 deletions(-)
---
base-commit: 80e14080a00bc429a4ee440d17746a49867df663
change-id: 20250429-work-coredump-socket-87cc0f17729c

