Message-ID: <20250311-umkreisen-versorgen-6388fdf4024e@brauner>
Date: Tue, 11 Mar 2025 13:02:23 +0100
From: Christian Brauner <brauner@...nel.org>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: aleksandr.mikhalitsyn@...onical.com, arnd@...db.de, bluca@...ian.org,
cgroups@...r.kernel.org, davem@...emloft.net, edumazet@...gle.com, hannes@...xchg.org,
kuba@...nel.org, leon@...nel.org, linux-kernel@...r.kernel.org, mkoutny@...e.com,
mzxreary@...inter.de, netdev@...r.kernel.org, pabeni@...hat.com, shuah@...nel.org,
tj@...nel.org, willemb@...gle.com
Subject: Re: [PATCH net-next 0/4] Add getsockopt(SO_PEERCGROUPID) and fdinfo
API to retreive socket's peer cgroup id

On Tue, Mar 11, 2025 at 12:33:48AM -0700, Kuniyuki Iwashima wrote:
> From: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@...onical.com>
> Date: Sun, 9 Mar 2025 14:28:11 +0100
> > 1. Add socket cgroup id and socket's peer cgroup id in socket's fdinfo
>
> Why do you want to add yet another racy interface ?
>
>
> > 2. Add SO_PEERCGROUPID which allows to retrieve socket's peer cgroup id
> > 3. Add SO_PEERCGROUPID kselftest
> >
> > Generally speaking, this API allows race-free resolution of socket's peer cgroup id.
> > Currently, to do that SCM_CREDENTIALS/SCM_PIDFD -> pid -> /proc/<pid>/cgroup sequence
> > is used which is racy.
>
> Few more words about the race (recycling pid ?) would be appreciated.
>
> I somewhat assumed pid is not recycled until all of its pidfd are
> close()d, but sounds like no ?

No, that would allow starving the kernel of pid numbers.
pidfds don't pin struct task_struct for a multitude of reasons similar
to how cred->peer or scm->pid don't stash a task_struct but a struct pid.

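
For illustration, here is a minimal sketch of the numeric-pid lookup the
cover letter calls racy. It is not taken from the patch series: it uses
SO_PEERCRED rather than SCM_CREDENTIALS to obtain the peer pid, and the
helper names (peer_pid, cgroup_of_pid, racy_peer_cgroup) are made up for
this example. The race sits between the two steps: once userspace holds
only a pid number, the peer can exit and the number can be reused before
/proc/<pid>/cgroup is opened.

```python
import os
import socket
import struct

# struct ucred on Linux is { pid_t pid; uid_t uid; gid_t gid; }
UCRED_FMT = "iII"


def peer_pid(sock: socket.socket) -> int:
    """Return the peer pid recorded for a connected AF_UNIX socket."""
    raw = sock.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(UCRED_FMT)
    )
    pid, _uid, _gid = struct.unpack(UCRED_FMT, raw)
    return pid


def cgroup_of_pid(pid: int) -> str:
    """Read /proc/<pid>/cgroup for a numeric pid."""
    # RACE WINDOW: nothing ties this numeric pid to the original peer.
    # If the peer exited after peer_pid() returned, this may fail with
    # FileNotFoundError or, worse, describe an unrelated process that
    # was given the same pid number.
    with open(f"/proc/{pid}/cgroup") as f:
        return f.read()


def racy_peer_cgroup(sock: socket.socket) -> str:
    """Resolve the peer's cgroup via pid -> procfs (the racy sequence)."""
    return cgroup_of_pid(peer_pid(sock))


if __name__ == "__main__":
    # socketpair() records the creating process as the peer on both ends,
    # so this prints the cgroup membership of the current process.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        print(racy_peer_cgroup(a), end="")
    finally:
        a.close()
        b.close()
```

The sketch only shows where the window is. An interface that reads the
cgroup id from the socket itself never goes through a pid number, so it
has no such window.
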
>
>
> >
> > As we don't add any new state to the socket itself there are no potential locking issues
> > or performance problems. We use already existing sk->sk_cgrp_data.
> >
> > We already have analogous interfaces to retrieve this
> > information:
> > - inet_diag: INET_DIAG_CGROUP_ID
> > - eBPF: bpf_sk_cgroup_id
> >
> > Having getsockopt() interface makes sense for many applications, because using eBPF is
> > not always an option, while inet_diag has obvious complexity and performance drawbacks
> > if we only want to get this specific info for one specific socket.
>
> If it's limited to the connect()ed peer, I'd add UNIX_DIAG_CGROUP_ID
> and UNIX_DIAG_PEER_CGROUP_ID instead. Then also ss can use that easily.