linux-kernel - Re: [PATCH v4 04/11] net: reserve prefix


Message-ID: <ac9c25f0-9979-44ee-bcd7-74539aa8f1b5@iogearbox.net>
Date: Fri, 9 May 2025 10:07:37 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Christian Brauner <brauner@...nel.org>,
 Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: alexander@...alicyn.com, bluca@...ian.org, daan.j.demeyer@...il.com,
 davem@...emloft.net, david@...dahead.eu, edumazet@...gle.com,
 horms@...nel.org, jack@...e.cz, jannh@...gle.com, kuba@...nel.org,
 lennart@...ttering.net, linux-fsdevel@...r.kernel.org,
 linux-kernel@...r.kernel.org, me@...dnzj.com, netdev@...r.kernel.org,
 oleg@...hat.com, pabeni@...hat.com, viro@...iv.linux.org.uk,
 zbyszek@...waw.pl
Subject: Re: [PATCH v4 04/11] net: reserve prefix

On 5/9/25 7:54 AM, Christian Brauner wrote:
> On Thu, May 08, 2025 at 02:47:45PM -0700, Kuniyuki Iwashima wrote:
>> From: Christian Brauner <brauner@...nel.org>
>> Date: Thu, 8 May 2025 08:16:29 +0200
>>> On Wed, May 07, 2025 at 03:45:52PM -0700, Kuniyuki Iwashima wrote:
>>>> From: Christian Brauner <brauner@...nel.org>
>>>> Date: Wed, 07 May 2025 18:13:37 +0200
>>>>> Add the reserved "linuxafsk/" prefix for AF_UNIX sockets and require
>>>>> CAP_NET_ADMIN in the owning user namespace of the network namespace to
>>>>> bind it. This will be used in next patches to support the coredump
>>>>> socket but is a generally useful concept.
>>>>
>>>> I really think we shouldn't reserve an address; it should be
>>>> configurable by users via core_pattern, as with the other
>>>> coredump types.
>>>>
>>>> AF_UNIX doesn't support SO_REUSEPORT, so once the socket is
>>>> dying, the user can't start a new coredump listener until it's
>>>> fully cleaned up, which adds an unnecessary drawback.
>>>
>>> This really doesn't matter.
>>>
>>>> The semantics should be the same as for the other types: the task
>>>> of the coredump service is to prepare an endpoint (file, process,
>>>> socket) that can receive data and to set its name in core_pattern.
>>>
>>> We need to perform a capability check during bind() for the host's
>>> coredump socket. Otherwise if the coredump server crashes an
>>> unprivileged attacker can simply bind the address and receive all
>>> coredumps from suid binaries.
>>
>> As I mentioned in the previous thread, this can be better
>> handled by BPF LSM with more fine-grained rules.
>>
>> 1. register a socket with its name in a BPF map
>> 2. check if the destination socket is registered at connect
>>
>> Even when LSM is not available, the cgroup BPF prog can make
>> connect() fail if the destination name is not registered
>> in the map.
>>
>>> This is also a problem for legitimate coredump server updates. To change
>>> the coredump address the coredump server must first set up a new socket,
>>> then update core_pattern, and then shut down the old coredump socket.
>>
>> So, for completeness, the server should set up a cgroup BPF
>> prog to route the request for the old name to the new one.
>>
>> Here, the bpf map above can be reused to check if the socket
>> name is registered in the map or route to another socket in
>> the map.
>>
>> Then, the unprivileged issue below and the non-dumpable issue
>> mentioned in the cover letter can also be resolved.
>>
>> The server is expected to have CAP_SYS_ADMIN, so BPF should
>> play a role.
> 
> This has been explained by multiple people over the course of this
> thread already. It is simply not acceptable for basic kernel
> functionality to be unsafe without the use of additional separate
> subsystems. It is not ok to require bpf for a core kernel api to be
> safely usable. It's irrelevant whether that's for security or cgroup
> hooks. None of which we can require.

As much as I like BPF, I agree with Christian here that we should
not rely on additional subsystems, which might even be compiled
out in some cases where coredumps are needed (e.g. embedded).
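
For illustration only, the rule from the patch description (names under
the reserved "linuxafsk/" prefix need CAP_NET_ADMIN to bind, everything
else is unrestricted) could be modelled in userspace roughly as below.
This is a sketch and not the code from the patch: bind_allowed() and its
has_cap_net_admin parameter are hypothetical names, with the boolean
standing in for the capability check against the user namespace that
owns the network namespace.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Reserved prefix as named in the patch description. */
#define RESERVED_PREFIX "linuxafsk/"

/*
 * Hypothetical userspace model, not the kernel implementation.
 *
 * Returns true if binding to `name` (`len` bytes, not necessarily
 * NUL-terminated) would be allowed. `has_cap_net_admin` is an assumed
 * stand-in for the real capability check.
 */
static bool bind_allowed(const char *name, size_t len, bool has_cap_net_admin)
{
	const size_t plen = sizeof(RESERVED_PREFIX) - 1;

	/* Names outside the reserved prefix are not restricted. */
	if (len < plen || memcmp(name, RESERVED_PREFIX, plen) != 0)
		return true;

	/* Names under the reserved prefix require the capability. */
	return has_cap_net_admin;
}
```

The point of doing the check at bind() time is that it holds with no
other subsystem configured: if the coredump server crashes, an
unprivileged task still cannot take over a name under the prefix.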