[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAG48ez3i-QGb_b_Mo50rN-kROPh1vrZhr06sGkb1BYzmRDvffw@mail.gmail.com>
Date: Fri, 16 May 2025 16:26:15 +0200
From: Jann Horn <jannh@...gle.com>
To: Christian Brauner <brauner@...nel.org>
Cc: linux-fsdevel@...r.kernel.org, Daniel Borkmann <daniel@...earbox.net>,
Kuniyuki Iwashima <kuniyu@...zon.com>, Eric Dumazet <edumazet@...gle.com>, Oleg Nesterov <oleg@...hat.com>,
"David S. Miller" <davem@...emloft.net>, Alexander Viro <viro@...iv.linux.org.uk>,
Daan De Meyer <daan.j.demeyer@...il.com>, David Rheinsberg <david@...dahead.eu>,
Jakub Kicinski <kuba@...nel.org>, Jan Kara <jack@...e.cz>,
Lennart Poettering <lennart@...ttering.net>, Luca Boccassi <bluca@...ian.org>, Mike Yuan <me@...dnzj.com>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Zbigniew Jędrzejewski-Szmek <zbyszek@...waw.pl>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
linux-security-module@...r.kernel.org,
Alexander Mikhalitsyn <alexander@...alicyn.com>
Subject: Re: [PATCH v7 5/9] pidfs, coredump: add PIDFD_INFO_COREDUMP
On Fri, May 16, 2025 at 12:34 PM Christian Brauner <brauner@...nel.org> wrote:
> On Thu, May 15, 2025 at 10:56:26PM +0200, Jann Horn wrote:
> > Why can we safely put the pidfs reference now but couldn't do it
> > before the kernel_connect()? Does the kernel_connect() look up this
> > pidfs entry by calling something like pidfs_alloc_file()? Or does that
> > only happen later on, when the peer does getsockopt(SO_PEERPIDFD)?
>
> AF_UNIX sockets support SO_PEERPIDFD as you know. Users such as dbus or
> systemd want to be able to retrieve a pidfd for the peer even if the
> peer has already been reaped. To support this AF_UNIX ensures that when
> the peer credentials are set up (connect(), listen()) the corresponding
> @pid will also be registered in pidfs. This ensures that exit
> information is stored in the inode if we hand out a pidfd for a reaped
> task. IOW, we only hand out pidfds for reaped task if at the time of
> reaping a pidfs entry existed for it.
>
> Since we're setting coredump information on the pidfd here we're calling
> pidfs_register_pid() even before connect() sets up the peer credentials
> so we're sure that the coredump information is stored in the inode.
>
> Then we delay our pidfs_put_pid() call until the connect() took it's own
> reference and thus continues pinning the inode. IOW, connect() will also
> call pidfs_register_pid() but it will ofc just increment the reference
> count ensuring that our pidfs_put_pid() doesn't drop the inode.
Aah, so the call graph looks like this:
unix_stream_connect
prepare_peercred
pidfs_register_pid
[pidfs reference taken]
[point of no return]
init_peercred
[copies creds to socket, moving ref ownership]
copy_peercred
[copies creds from socket to peer socket, taking refs]
Thanks for explaining!
Powered by blists - more mailing lists