Message-ID: <20250516205430.93517-1-kuniyu@amazon.com>
Date: Fri, 16 May 2025 13:54:21 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <willemdebruijn.kernel@...il.com>
CC: <brauner@...nel.org>, <davem@...emloft.net>, <edumazet@...gle.com>,
<horms@...nel.org>, <kuba@...nel.org>, <kuni1840@...il.com>,
<kuniyu@...zon.com>, <netdev@...r.kernel.org>, <pabeni@...hat.com>,
<willemb@...gle.com>
Subject: Re: [PATCH v4 net-next 7/9] af_unix: Inherit sk_flags at connect().
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Date: Fri, 16 May 2025 15:27:48 -0400
> Kuniyuki Iwashima wrote:
> > From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
> > Date: Fri, 16 May 2025 12:47:13 -0400
> > > Kuniyuki Iwashima wrote:
> > > > For SOCK_STREAM embryo sockets, the SO_PASS{CRED,PIDFD,SEC} options
> > > > are inherited from the parent listen()ing socket.
> > > >
> > > > Currently, this inheritance happens at accept(), because these
> > > > attributes were stored in sk->sk_socket->flags and the struct socket
> > > > is not allocated until accept().
> > > >
> > > > This leads to unintentional behaviour.
> > > >
> > > > When a peer sends data to an embryo socket in the accept() queue,
> > > > unix_maybe_add_creds() embeds credentials into the skb, even if
> > > > neither the peer nor the listener has enabled these options.
> > > >
> > > > If the option is enabled, the embryo socket receives the ancillary
> > > > data after accept(). If not, the data is silently discarded.
> > > >
> > > > This conservative approach works for SO_PASS{CRED,PIDFD,SEC}, but
> > > > would not for SO_PASSRIGHTS; once an SCM_RIGHTS with a hung file
> > > > descriptor was sent, it'd be game over.
> > >
> > > Code LGTM, hence my Reviewed-by.
> > >
> > > Just curious: could this case be handled in a way that does not
> > > require receivers explicitly disabling a dangerous default mode?
> > >
> > > IIUC the issue is the receiver taking a file reference using fget_raw
> > > in scm_fp_copy from __scm_send, and if that is the last ref, it now
> > > will hang the receiver process waiting to close this last ref?
> > >
> > > If so, could the unwelcome ref be detected at accept, and taken from
> > > the responsibility of this process? Worst case, assigned to some
> > > zombie process.
> >
> > I had the same idea and I think it's doable but complicated.
> >
> > We can't detect such a hung fd until we actually do close() it (*), so
> > the workaround at recvmsg() would be to always call an extra fget_raw()
> > and queue the fd to another task (kthread or workqueue?).
> >
> > The task can't release the ref until it can ensure that the receiver
> > of fd has close()d it, so the task will need to check ref == 1
> > periodically.
> >
> > But, once the task gets stuck, we need to add another task, or all
> > fds will be leaked for a while.
> >
> >
> > (*) With bpf lsm, we will be able to inspect such fd at sendmsg() but
> > still can't know if it will really hang at close(), especially if it's on
> > NFS.
> > https://github.com/q2ven/linux/commit/a9f03f88430242d231682bfe7c19623b7584505a
>
> Thanks. Yeah, I had not thought it through as much, but this is
> definitely complex. Not sure even what the is_hung condition would be
> exactly.
>
> Given that not wanting to receive untrusted FDs from untrusted peers
> is quite common, perhaps a likely eventual follow-on to this series is
> a per-netns sysctl to change the default.
Makes sense, I'll add a follow-up patch in the LSM series.
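
For anyone following along, the SO_PASSCRED inheritance from the quoted
commit message can be observed from userspace with something like the
sketch below.  It is only an illustration, not part of the series: the
helper name embryo_receives_creds() is made up, and it assumes Linux with
abstract-socket autobind.  Only the listener enables SO_PASSCRED, and the
data is sent while the connection is still an embryo in the accept() queue.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for struct ucred and SCM_CREDENTIALS */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if data sent before accept() is received
 * with SCM_CREDENTIALS carrying our own pid, 0 if no credentials arrive,
 * and -1 on any syscall error.
 */
int embryo_receives_creds(void)
{
	union {
		char buf[CMSG_SPACE(sizeof(struct ucred))];
		struct cmsghdr align;	/* keeps buf suitably aligned */
	} ctl;
	char data;
	struct iovec iov = { .iov_base = &data, .iov_len = sizeof(data) };
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctl.buf,
		.msg_controllen = sizeof(ctl.buf),
	};
	struct sockaddr_un addr;
	socklen_t addrlen = sizeof(addr);
	struct cmsghdr *cmsg;
	struct ucred cred;
	int listener, client, child = -1, on = 1, ret = -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;

	listener = socket(AF_UNIX, SOCK_STREAM, 0);
	client = socket(AF_UNIX, SOCK_STREAM, 0);
	if (listener < 0 || client < 0)
		goto out;

	/* Only the listener asks for credentials. */
	if (setsockopt(listener, SOL_SOCKET, SO_PASSCRED, &on, sizeof(on)))
		goto out;

	/* addrlen == sizeof(sa_family_t) autobinds to an abstract address. */
	if (bind(listener, (struct sockaddr *)&addr, sizeof(sa_family_t)))
		goto out;
	if (getsockname(listener, (struct sockaddr *)&addr, &addrlen))
		goto out;
	assert(addrlen <= sizeof(addr));
	if (listen(listener, 1))
		goto out;

	/* Send while the peer is still an embryo, i.e. before accept(). */
	if (connect(client, (struct sockaddr *)&addr, addrlen))
		goto out;
	if (send(client, "x", 1, 0) != 1)
		goto out;

	child = accept(listener, NULL, NULL);
	if (child < 0)
		goto out;
	if (recvmsg(child, &msg, 0) != 1)
		goto out;

	ret = 0;
	for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_CREDENTIALS) {
			memcpy(&cred, CMSG_DATA(cmsg), sizeof(cred));
			ret = cred.pid == getpid();
		}
	}
out:
	if (child >= 0)
		close(child);
	if (client >= 0)
		close(client);
	if (listener >= 0)
		close(listener);
	return ret;
}
```

Calling embryo_receives_creds() from a small main() should return 1, since
the accepted socket picks up the listener's SO_PASSCRED and the credentials
are already embedded in the queued skb.  As far as I can tell the result is
the same with and without this patch; what changes is when the flag is
inherited, not what the receiver sees in this case.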