Message-ID: <20250515203540.85511-1-kuniyu@amazon.com>
Date: Thu, 15 May 2025 13:35:09 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <willemdebruijn.kernel@...il.com>
CC: <brauner@...nel.org>, <davem@...emloft.net>, <edumazet@...gle.com>,
<horms@...nel.org>, <kuba@...nel.org>, <kuni1840@...il.com>,
<kuniyu@...zon.com>, <netdev@...r.kernel.org>, <pabeni@...hat.com>,
<willemb@...gle.com>
Subject: Re: [PATCH v3 net-next 6/9] af_unix: Move SOCK_PASS{CRED,PIDFD,SEC} to struct sock.
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Date: Thu, 15 May 2025 14:44:14 -0400
> Kuniyuki Iwashima wrote:
> > As explained in the next patch, SO_PASSRIGHTS would have a problem
> > if we assigned a corresponding bit to socket->flags, so it must be
> > managed in struct sock.
> >
> > Mixing socket->flags and sk->sk_flags for similar options will look
> > confusing, and sk->sk_flags does not have enough space on 32bit system.
> >
> > Also, as mentioned in commit 16e572626961 ("af_unix: dont send
> > SCM_CREDENTIALS by default"), SOCK_PASSCRED and SOCK_PASSPID handling
> > is known to be slow, and managing the flags in struct socket cannot
> > avoid that for embryo sockets.
> >
> > Let's move SOCK_PASS{CRED,PIDFD,SEC} to struct sock.
> >
> > While at it, other SOCK_XXX flags in net.h are grouped as enum.
> >
> > Note that assign_bit() was atomic, so the writer side is moved down
> > after lock_sock() in setsockopt(), but the bit is only read once
> > in sendmsg() and recvmsg(), so lock_sock() is not needed there.
>
> Because the socket lock is already held there?
No, for example, scm_recv_unix() is called without lock_sock(),
but that's okay: reading a single bit is always a matter of
timing, i.e. when the flag is snapshotted (unless there is another
dependency or the bit is read more than once).
With lock_sock(), the write happens before or after the if block:

  <-- write could happen here
  lock_sock()
  if (sk->sk_scm_credentials) {
          do something
  }
  release_sock()
  <-- or here (not related to logic)

but this is the same without lock_sock() if the bit is read only
once:

  <-- write could happen here
  if (sk->sk_scm_credentials) {
          do something <-- or here (not related to logic)
  }
  <-- or here (not related to logic)

So for SOCK_PASSXXX bits, lock_sock() prevents a data race
between writers, as you pointed out, but it does nothing
for readers.
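
As a rough userspace illustration of the argument above (not kernel
code), the sketch below serializes writers with a mutex and lets the
reader take a single lockless snapshot. Everything in it is an
assumption made for illustration: struct fake_sock, the FAKE_* bits,
and the use of a C11 atomic word in place of the sk->sk_scm_* bits.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits, standing in for the SOCK_PASSXXX bits. */
#define FAKE_PASSCRED  0x1u
#define FAKE_PASSPIDFD 0x2u

/* Hypothetical userspace stand-in for struct sock. */
struct fake_sock {
	pthread_mutex_t lock;   /* plays the role of lock_sock() */
	atomic_uint scm_flags;  /* plays the role of the sk_scm_* bits */
};

static void fake_sock_init(struct fake_sock *sk)
{
	pthread_mutex_init(&sk->lock, NULL);
	atomic_init(&sk->scm_flags, 0);
}

/*
 * Writer: a non-atomic read-modify-write of the flag word.  Two
 * writers racing here could lose an update, so they are serialized
 * by the lock.
 */
static void fake_set_flag(struct fake_sock *sk, unsigned int bit, bool on)
{
	unsigned int v;

	assert(bit != 0);

	pthread_mutex_lock(&sk->lock);
	v = atomic_load_explicit(&sk->scm_flags, memory_order_relaxed);
	v = on ? (v | bit) : (v & ~bit);
	atomic_store_explicit(&sk->scm_flags, v, memory_order_relaxed);
	pthread_mutex_unlock(&sk->lock);
}

/*
 * Reader: one lockless load.  Taking the lock here would not change
 * the outcome; the writer could still run just before or just after.
 */
static bool fake_test_flag(struct fake_sock *sk, unsigned int bit)
{
	return atomic_load_explicit(&sk->scm_flags, memory_order_relaxed) & bit;
}
```

Whether the reader sees the old or the new value depends only on when
it runs relative to the writer, with or without the lock on the reader
side.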
>
> What about getsockopt. And the one READ_ONCE in unix_accept.
The same applies to getsockopt().
Regarding unix_accept(), I used READ_ONCE() there to snapshot
all the flags at once, but given that each value is a single bit,
this is also a matter of when the values are snapshotted.
Also, in the next patch, sk->sk_scm_recv_flags will be read
under lock_sock(), so it's done without READ_ONCE().
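
To make the "snapshot all the flags" point concrete, here is a small
userspace sketch (again, not kernel code). It assumes the flags live in
one byte-sized word and that the snapshot is then copied to another
socket; struct demo_sock, the DEMO_* bits and both helpers are
hypothetical names, and a relaxed C11 atomic load stands in for
READ_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag bits packed into a single byte. */
#define DEMO_PASSCRED  0x1u
#define DEMO_PASSPIDFD 0x2u
#define DEMO_PASSSEC   0x4u

static_assert((DEMO_PASSCRED | DEMO_PASSPIDFD | DEMO_PASSSEC) <= 0xff,
	      "all flags must fit in one byte");

/* Hypothetical stand-in for the flag word in struct sock. */
struct demo_sock {
	atomic_uchar scm_recv_flags;
};

/* One load captures every bit at the same instant. */
static unsigned char demo_snapshot_flags(struct demo_sock *sk)
{
	return atomic_load_explicit(&sk->scm_recv_flags,
				    memory_order_relaxed);
}

/* Assumed inheritance step: the child receives the snapshot as a whole. */
static void demo_inherit_flags(struct demo_sock *child,
			       struct demo_sock *listener)
{
	atomic_store_explicit(&child->scm_recv_flags,
			      demo_snapshot_flags(listener),
			      memory_order_relaxed);
}
```

A concurrent writer can still flip a bit right before or right after
the load, so the single read only fixes the moment of the snapshot.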